
DESCRIPTION 

Development of Novel Anti-Microbial Agents Based on Bacteriophage Genomics

BACKGROUND OF THE INVENTION

The present invention relates to the field of antibacterial agents and the
treatment of infections of animals or other complex organisms by bacteria.

The frequency and spectrum of antibiotic-resistant infections have, in recent
years, increased in both the hospital and the community. Certain infections have become
essentially untreatable and are growing to epidemic proportions in the developing
world as well as in institutional settings in the developed world. The staggering
spread of antibiotic resistance in pathogenic bacteria has been attributed to microbial
genetic characteristics, widespread use of antibiotic drugs, and changes in society that
enhance the transmission of drug-resistant organisms. This spread of drug-resistant
microbes is leading to ever-increasing morbidity, mortality, and health-care costs.

Ironically, it is the very success of antibiotics, resulting in their widespread
use, that has contributed the most to rising numbers of drug-resistant bacterial strains.
The longer a bacterial strain is exposed to a drug, the more likely it is to acquire
resistance. Today, a total of 160 antibiotics, all based on a few basic chemical
structures and targeting a small number of metabolic pathways, have found their way
to market. Over-prescription of these drugs, as well as the failure of patients to
comply with the complete antibiotic regimen, has led to the rapid emergence of
antibiotic-resistant strains. Such misuse of prescriptions, careless use of antibiotics in
virtually all commercial production of beef and fowl, and changing societal
conditions, such as the growth of day-care centers, increased long-term care in
hospitals, and increased mobility of the population, have provided an environment
where drug-resistant microbes can emerge and spread. Thus, virtually all common
infectious bacteria are becoming, or have already become, resistant to one or more
groups of antibiotics. Such resistance now reaches all classes of antibiotics currently
in use, including β-lactams, fluoroquinolones, aminoglycosides, macrolide peptides,
chloramphenicol, tetracyclines, rifampicin, folate inhibitors, glycopeptides, and
mupirocin.

Over the last 45 years bacteria have adapted genetically to avoid the
destruction/alteration of the essential pathways that these chemotherapeutic agents
target. Antibiotic-resistant bacterial strains are now emerging at a higher rate than the
rate at which new antibiotics are being developed. The consequence of this dilemma
has been a dramatic increase in the cost of treating infections that would otherwise
easily succumb to routine antibiotic therapy. Furthermore, and perhaps most
importantly, the emergence of multiple drug-resistant pathogenic bacteria has led to a
significant increase in morbidity and mortality, particularly in institutional settings.

Most major pharmaceutical companies have on-going drug discovery
programs for novel anti-microbials. These are based on screens for small molecule
inhibitors (natural products, bacterial culture media, libraries of small molecules,
combinatorial chemistry) of crucial metabolic pathways of the micro-organism of
interest (e.g., bacteria, fungi, parasites, worms). The screening process is largely for
cytotoxic compounds and in most cases is not based on a known mechanism of action
of the compounds. Pharmaceutical companies have large programs in this area.
Classical drug screening programs are being exhausted and many of these
pharmaceutical companies are looking towards rational drug design programs.

Several small to mid-size biotechnology companies as well as large
pharmaceutical companies have developed systematic high-throughput sequencing
programs to decipher the genetic code of specific micro-organisms of interest. The
goal is to identify, through sequencing, biochemical pathways or intermediates
that are unique to the microorganism. Knowledge of this may, in turn, form the
rationale for a drug discovery program based on the mechanism of action of the
identified enzymes/proteins. Genome Therapeutics Corp., The Institute for Genomic
Research, Human Genome Sciences Inc., and other companies have such sequencing
programs in place. However, one of the most critical steps in this approach is the
ascertainment that the identified proteins and biochemical pathways 1) are
non-redundant and essential for bacterial survival, and 2) constitute suitable and
accessible targets for drug discovery.

SUMMARY OF THE INVENTION

While animals such as humans are, on occasion, infected by pathogenic
bacteria, bacteria also have natural enemies. A number of host-specific viruses,
known as bacteriophages or phages, infect and kill bacteria in the natural
environment. Such bacteriophages generally have small compact genomes and
bacteria are their exclusive hosts. Many known bacteria are host to a large number of
bacteriophages that have been described in the literature. During the 1940s-1960s,
phage biology was an area of active research. As a testimony to this, the study of
phages which infect and inhibit the enteric bacterium Escherichia coli (E. coli)
contributed much to the early understanding of molecular biology and virology.

As is generally understood, bacteriophage (or phages) are viruses that infect
and kill bacteria. They are natural enemies of bacteria and, over the course of
evolution, have developed proteins (products of DNA sequences) which enable them
to infect a host bacterium, replicate their genetic material, usurp host metabolism, and
ultimately kill their host. The scientific literature well documents the fact that many
known bacteria have a large number of such bacteriophages (Ackermann and DuBow,
1987) that can infect and kill them (for example, see the ATCC bacteriophage
collection at http://www.atcc.org).

This invention utilizes the observation that bacteriophages successfully infect
and inhibit or kill host bacteria, targeting a variety of normal host metabolic and
physiological traits, some of which are shared by all bacteria, pathogenic and
nonpathogenic alike. The term "pathogenic" as used herein denotes a contribution to
or implication in disease or a morbid state of an infected organism. The invention
thus involves identifying and elucidating the molecular mechanisms by which phages
interfere with host bacterial metabolism, an objective being to provide novel targets
for drug design. Whether the phage blocks bacterial RNA transcription or translation,
or attacks other important metabolic pathways, such as cell wall assembly or
membrane integrity, the basic blueprint for a phage's bacteria-inhibiting ability is
encoded in its genome and can be unlocked using bioinformatics, functional
genomics, and proteomics. By these means, the invention utilizes sequence
information from the genomics of bacteriophage to identify novel antimicrobials that
can be further used to actively and/or prophylactically treat bacterial infection.

Two important components of the invention thus are: i) the identification of
bacteria-inhibiting phage open reading frames ("ORFs") and corresponding products
that can be used to develop antibiotics based on amino acid sequence and secondary
structural characteristics of the ORF products, and ii) the use of bacteriophages to map
out essential bacterial target genes and homologs, which can in turn lead to the
development of suitable anti-microbial agents. These two avenues represent new and
general methods for developing novel antimicrobials.

The invention thus concerns the identification of bacteriophage ORFs that
supply bacteria-inhibiting functions. In this regard, use of the terms "inhibit",
"inhibition", "inhibitory", and "inhibitor" all refer to a function of reducing a
biological activity or function. Such reduction in activity or function can, for
example, be in connection with a cellular component, e.g., an enzyme, or in
connection with a cellular process, e.g., synthesis of a particular protein, or in
connection with an overall process of a cell, e.g., cell growth. In reference to bacterial
cell growth, for example, an inhibitory effect (i.e., a bacteria-inhibiting effect) may be
bactericidal (killing of bacterial cells) or bacteriostatic (i.e., stopping or at least
slowing bacterial cell growth). The latter slows or prevents cell growth such that
fewer cells of the strain are produced relative to uninhibited cells over a given period
of time. From a molecular standpoint, such inhibition may equate with a reduction in
the level of, or elimination of, the transcription and/or translation of a specific
bacterial target(s), or reduction or elimination of activity of a particular target
biomolecule.

It is particularly advantageous to evaluate a plurality of different phage ORFs
for inhibitory activity; the ORFs may be from one phage, but are preferably from a
plurality of different phage. For example, evaluating ORFs from a number of different
phage of the same bacterial host provides at least two advantages. First, the multiple
phage will provide identification of a variety of different targets. Second, it is likely
that multiple phage will utilize the same cellular target, which provides further
indication of the suitability of that target.

As used herein, the terms "bacteriophage" and "phage" are used
interchangeably to refer to a virus which can infect a bacterial strain or a number of
different bacterial strains.

In the context of this invention, the term "bacteriophage ORF" or "phage
ORF" or similar term refers to a nucleotide sequence in or from a bacteriophage. In
connection with a particular ORF, the terms refer to an open reading frame which has at
least 95% sequence identity, preferably at least 97% sequence identity, more
preferably at least 98% sequence identity with an ORF from the particular phage
identified herein (e.g., with an ORF as identified herein) or to a nucleic acid sequence
which has the specified sequence identity percentage with such an ORF sequence.
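
By way of illustration only, the sequence identity thresholds stated above can be checked computationally. The following Python sketch assumes that two ORF sequences have already been aligned by some alignment program and are supplied as equal-length strings in which "-" marks a gap. The function names, and the choice to leave gapped columns out of the calculation, are assumptions made for this example and are not part of the methods described herein.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption for this sketch: gapped columns are skipped rather than
    counted as mismatches. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the 95% / 97% / 98% levels recited above."""
    if pct >= 98.0:
        return "at least 98%"
    if pct >= 97.0:
        return "at least 97%"
    if pct >= 95.0:
        return "at least 95%"
    return "below 95%"
```

For example, `identity_tier(percent_identity(seq_a, seq_b))` reports which of the recited identity levels, if any, a candidate sequence reaches against a reference ORF.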

A first aspect of the invention thus provides a method for identifying a
bacteriophage nucleic acid coding region encoding a product active on an essential
bacterial target by identifying a nucleic acid sequence encoding a gene product which



WO 00/32825 



PCT/IB99/02040 



provides a bacteria-inhibiting function when the bacteriophage infects a host
bacterium, preferably one that is an animal or plant pathogen, more preferably a bird
or mammalian pathogen, and most preferably a human pathogen. The bacteriophage
is an uncharacterized bacteriophage. Thus, the method excludes, for example, phage
λ, φX174, M13, and other E. coli-specific bacteriophage that have been studied with
respect to gene number and/or function. It also excludes, for example, the nucleic
acid coding regions described in Tables 12-14, and in preferred embodiments,
excludes the phage in which those regions are naturally located.

In connection with bacteriophage, the term "uncharacterized" means that a
certain bacteriophage's genome has not yet been fully identified such that the genes
having a function involved in inhibiting host cells have not been identified. In
particular, phage for which the description of genomic or protein sequence was first
provided herein are uncharacterized. Phage sequences for which host
bacteria-inhibiting functions have been identified prior to the filing of the present
application (or alternatively prior to the present invention) are specifically excluded
from the aspects involving utilization of sequences from uncharacterized bacteriophage,
except that aspects may involve a plurality of phage where one or more of those phage are
uncharacterized and one or more others have been characterized to some extent. A
number of different bacteria-inhibiting phage ORFs are indicated in Tables 11-14.
The phage ORFs or sequences identified therein are not within the term
"uncharacterized"; alternatively, in preferred embodiments the phage containing those
ORFs are excluded from this term. Further, any additional phage ORFs (or
alternatively the phage which contain those ORFs) which have previously been
described in the art as bacteria-inhibiting ORFs are expressly excluded; those ORFs or
phage are known to those skilled in the art and the exclusion can be made express by
specifically naming such ORFs or phage as needed (likewise for uncharacterized
targets as described below). For the sake of brevity, such a listing is not expressly
presented, as such information is readily available to those skilled in the art.

Stating that an agent or compound is "active on" a particular cellular target,
such as the product of a particular gene, means that the target is an important part of a
cellular pathway which includes that target and that the agent acts on that pathway.
Thus, in some cases the agent may act on a component upstream or downstream of the
stated target, including on a regulator of that pathway or a component of that pathway.
By "essential", in connection with a gene or gene product, is meant that the host
cannot survive without, or is significantly growth-compromised in the absence,
depletion, or alteration of, the functional product. An "essential gene" is thus one that
encodes a product that is beneficial, or preferably necessary, for cellular growth in
vitro in a medium appropriate for growth of a strain having a wild-type allele
corresponding to the particular gene in question. Therefore, if an essential gene is
inactivated or inhibited, that cell will grow significantly more slowly, preferably at less
than 20%, more preferably less than 10%, most preferably less than 5% of the growth
rate of the uninhibited wild-type, or not at all, in the growth medium. Preferably, in
the absence of activity provided by a product of the gene, the cell will not grow at all
or will be non-viable, at least under culture conditions similar to the in vivo conditions
normally encountered by the bacterial cell during an infection. For example, absence
of the biological activity of certain enzymes involved in bacterial cell wall synthesis
can result in the lysis of cells under normal osmotic conditions, even though
protoplasts can be maintained under controlled osmotic conditions. In the context of
the invention, essential genes are generally the preferred targets of antimicrobial
agents. Essential genes can encode target molecules directly or can encode a product
involved in the production, modification, or maintenance of a target molecule.
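
As a non-limiting illustration of the growth-rate levels recited above, the following Python sketch compares a measured growth rate of an inhibited strain with that of the uninhibited wild-type and reports which level is met. The function name and the assumption that both rates are supplied in the same units (e.g., doublings per hour) are hypothetical choices made for this example.

```python
def growth_tier(inhibited_rate: float, wild_type_rate: float) -> str:
    """Classify an inhibited growth rate as a fraction of the wild-type rate.

    Both rates are assumed to be non-negative and expressed in the same units.
    The 20% / 10% / 5% cut-offs follow the levels recited in the text.
    """
    if wild_type_rate <= 0:
        raise ValueError("wild-type growth rate must be positive")
    if inhibited_rate < 0:
        raise ValueError("growth rate cannot be negative")
    fraction = inhibited_rate / wild_type_rate
    if fraction == 0:
        return "no growth"
    if fraction < 0.05:
        return "less than 5% of wild-type"
    if fraction < 0.10:
        return "less than 10% of wild-type"
    if fraction < 0.20:
        return "less than 20% of wild-type"
    return "20% of wild-type or more"
```

A result of "20% of wild-type or more" indicates that none of the preferred levels is met; whether such a strain is nevertheless "significantly" growth-compromised is a judgment this sketch does not attempt to make.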

A "target" refers to a biomolecule that can be acted on by an exogenous agent,
thereby modulating, preferably inhibiting, growth or viability of a cell. In most cases
such a target will be a nucleic acid sequence or molecule, or a polypeptide or protein.
However, other types of biomolecules can also be targets, e.g., membrane lipids and
cell wall structural components.

The term "bacterium" refers to a single bacterial strain, and includes a single
cell, and a plurality or population of cells of that strain unless clearly indicated to the
contrary. In reference to bacteria or bacteriophage, the term "strain" refers to bacteria
or phage having a particular genetic content. The genetic content includes genomic
content as well as recombinant vectors. Thus, for example, two otherwise identical
bacterial cells would represent different strains if each contained a vector, e.g., a
plasmid, with different phage ORF inserts.

In preferred embodiments, the phage is Staphylococcus aureus phage 77, 3A,
96, or 44AHJD, Enterococcus sp. phage 182, or Streptococcus pneumoniae phage
Dp-1.

In preferred embodiments, the phage is selected from. Preferred embodiments
involve expressing at least one recombinant phage ORF(s) in a bacterial host followed
by inhibition analysis of that host. Inhibition following expression of the phage ORF
is indicative that the product of the ORF is active on an essential bacterial target.
Such evaluation can be carried out in a variety of different formats, such as on a
support matrix such as a solidified medium in a petri dish, or in liquid culture.

Preferably a plurality of phage ORFs are expressed in at least one bacterium. The
plurality of phage ORFs can be from one or a plurality of phage. With respect to a
single phage or at least one phage in a plurality of phages, the plurality of expressed
ORFs preferably represents at least 10%, more preferably at least 20%, 40%, or 60%,
still more preferably at least 80% or 90%, and most preferably at least 95% of the
ORFs in the phage genome. For a plurality of phage, the plurality of
expressed ORFs preferably represents at least 10%, more preferably at least 20%,
40%, or 60%, still more preferably at least 80% or 90%, and most preferably at least
95% of the ORFs in the phage genome of each phage. The plurality of phage ORFs
can be expressed in a single bacterium, or in a plurality of bacteria where one ORF is
expressed in each bacterium, or in a plurality of bacteria where a plurality of ORFs are
expressed in at least one or in all of the plurality of bacteria, or combinations of these.

In embodiments of the above aspect (as well as in other aspects herein) in
which a plurality of phage are utilized, a plurality of phage have the same bacterial
host species; have different bacterial host species; or both. The plurality of phage
includes at least two different phage, preferably at least 3, 4, 5, 6, 8, 10, 15, 20, or more
different phage. Indeed, more preferably, the plurality of phage will include 50, 75,
100, or more phage. As described herein, the larger number of phage is useful to
provide additional target and target evaluation information useful in developing
antibacterial agents, for example, by providing identification of a larger range of
bacterial targets, and/or providing further indication of the suitability of a particular
target (for example, utilization of a target by a number of different unrelated phage
can suggest that the target is particularly stable and accessible and effective) and/or
can indicate alternate sites on a target which interact with different inhibitors.

Further embodiments involve confirmation of the inhibitor function of the
phage ORF, such as by utilizing or incorporating a control(s) designed to confirm the
inhibitory nature of the ORF(s) being evaluated. The control can, for example, be
provided by expression of an inactive or partially inactive form of the ORF or ORF
product, and/or by the absence of expression of the ORF or ORF product in the same
or a closely comparable bacterial strain as that used for expression of the test ORF.
The reduced level of activity or the absence of active ORF product in the control will
thus not provide the inhibition provided by a corresponding inhibitory ORF, or will
provide a distinguishably lower level of inhibition. An inactivated or partially
inactivated control has a mutation(s), e.g., in the coding region or in flanking
regulatory elements, that reduce(s) or eliminate(s) the normal function of the ORF.
Thus, the inhibition of a bacterium following expression of a phage ORF is
determined by comparison with the effects of expression of an inactivated ORF or the
response of the bacteria in the absence of expression in the same or similar type
bacterium. Such determination of inhibition of the bacterium following expression of
the ORF is indicative of a bacteria-inhibiting function. These manipulations are
routinely understood and accomplished by those of skill in the art using standard
techniques. In embodiments utilizing absence of expression of the ORF, the bacteria
can, for example, contain an empty vector or a vector which allows expression of an
unrelated sequence which is preferably non-inhibitory. Alternatively, the bacteria
may have no vector at all. Combinations of such controls or other controls may also
be utilized as recognized by those skilled in the art.
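
Purely as an illustrative sketch of the comparison with a control described above, the following Python example computes a percent inhibition from growth readings of a test culture (expressing the phage ORF) and a control culture (e.g., carrying an empty vector). The use of optical density readings, the function names, and the 50% cut-off are hypothetical assumptions made for this example; no particular read-out or threshold is prescribed herein.

```python
from statistics import mean
from typing import Sequence


def percent_inhibition(test_readings: Sequence[float],
                       control_readings: Sequence[float]) -> float:
    """Percent reduction of mean growth in the test culture relative to control.

    Readings are assumed to be replicate growth measurements (for example,
    optical density) taken under the same conditions for both cultures.
    """
    control = mean(control_readings)
    if control <= 0:
        raise ValueError("control growth must be positive")
    return 100.0 * (1.0 - mean(test_readings) / control)


def is_inhibitory(test_readings: Sequence[float],
                  control_readings: Sequence[float],
                  threshold_pct: float = 50.0) -> bool:
    """Flag an ORF as inhibitory when inhibition meets a chosen cut-off.

    threshold_pct is a hypothetical value chosen only for this example.
    """
    return percent_inhibition(test_readings, control_readings) >= threshold_pct
```

The same functions can be applied with an inactivated-ORF control in place of the empty-vector control, so that the two kinds of control described above can be compared side by side.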

In embodiments involving expression of a phage ORF in a bacterial strain,
that expression is preferably inducible.

By "inducible" is meant that expression is absent or occurs at a low level until
the occurrence of an appropriate environmental stimulus provides otherwise. For the
present invention such induction is preferably controlled by an artificial
environmental change, such as by contacting a bacterial strain population with an
inducing compound (i.e., an inducer). However, induction could also occur, for
example, in response to build-up of a compound produced by the bacteria in the
bacterial culture, e.g., in the medium. As uncontrolled or constitutive expression of
inhibitory ORFs can severely compromise bacteria to the point of eradication, such
expression is therefore undesirable in many cases because it would prevent effective
evaluation of the strain and inhibitor being studied. For example, such uncontrolled
expression could prevent any growth of the strain following insertion of a
recombinant ORF, thus preventing determination of effective transfection or
transformation. A controlled or inducible expression is therefore advantageous and is
generally provided through the provision of suitable regulatory elements, e.g.,
promoter/operator sequences that can be conveniently transcriptionally linked to a
coding sequence to be evaluated. In most cases, the vector will also contain
sequences suitable for efficient replication of the vector in the same or different host
cells and/or sequences allowing selection of cells containing the vector, i.e.,
"selectable markers." Further, preferred vectors include convenient primer sequences
flanking the cloning region from which PCR and/or sequencing may be performed.

As knowledge of the nucleotide sequence of phage ORFs is useful, e.g., for
assisting in the identification of phage proteins active against essential bacterial host
targets, preferred embodiments involve the sequencing of at least a portion of the
phage genome in combination with the above methods. This can be done either before
or after or independent of expression and inhibition of the ORF in the bacteria, and
provides information on the nature and characteristics of the ORF. Such a portion is
preferably at least 10%, 20%, 40%, 80%, 90%, or 100% of the phage genome. For
embodiments in which a plurality of phage are utilized, preferably each phage is
sequenced to an extent as just specified.

Such sequencing is preferably accompanied by computer sequence analysis to
define and evaluate ORF(s), ORF products, structural motifs or functional properties
of ORF products, and/or their genetic control elements. Thus, certain embodiments
incorporate computer sequence analyses of nucleic acid and/or amino acid sequences.
Further, existing data banks can provide phage sequence and product information
which can be utilized for analysis and identification of ORFs in the sequence.
Computer analysis may further employ known homologous sequences from other
species that suggest or indicate conserved underlying biochemical function(s) for the
inhibitory or potentially inhibitory ORF sequence(s) being evaluated. This can
include the sequences of signature motifs of identified classes of inhibitors.
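
One simple form of the computer sequence analysis referred to above is a scan of both strands of a sequence for open reading frames. The following Python sketch is offered only as an example under stated assumptions: it treats ATG as the sole start codon and TAA, TAG, and TGA as stop codons, and applies a minimum-length filter whose default value is arbitrary. Alternative start codons, which do occur in bacteria and phage, are not handled, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 30):
    """Scan all six reading frames for ATG...stop open reading frames.

    Returns a list of (strand, start, end, orf_sequence) tuples. Coordinates
    are 0-based, end-exclusive, and refer to the strand that was scanned.
    min_codons counts the codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # resume the search after the stop codon
    return orfs
```

ORFs found in this way could then be translated and compared against sequence databases, as described in the surrounding text.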

In the context of the phage nucleic acid sequences, e.g., gene sequences, of this
invention, the terms "homolog" and "homologous" denote nucleotide sequences from
different bacteria or phage strains or species or from other types of organisms that
have significantly related nucleotide sequences, and consequently significantly related
encoded gene products, preferably having related function. Homologous gene
sequences or coding sequences have at least 70% sequence identity (as defined by the
maximal base match in a computer-generated alignment of two or more nucleic acid
sequences) over at least one sequence window of 48 nucleotides, more preferably at
least 80 or 85%, still more preferably at least 90%, and most preferably at least 95%.
The polypeptide products of homologous genes have at least 35% amino acid
sequence identity over at least one sequence window of 18 amino acid residues, more
preferably at least 40%, still more preferably at least 50% or 60%, and most
preferably at least 70%, 80%, or 90%. Preferably, the homologous gene product is
also a functional homolog, meaning that the homolog will functionally complement
one or more biological activities of the product being compared. For nucleotide or
amino acid sequence comparisons where a homology is defined by a % sequence
identity, the percentage is determined using BLAST programs with default
parameters (Altschul et al., 1997, "Gapped BLAST and PSI-BLAST: a new
generation of protein database search programs," Nucleic Acids Res. 25:3389-3402).
Any of a variety of algorithms known in the art which provide comparable results can
also be used, preferably using default parameters. Performance characteristics for
three different algorithms in homology searching are described in Salamov et al., 1999,
"Combining sensitive database searches with multiple intermediates to detect distant
homologues," Protein Eng. 12:95-100. Another exemplary program package is the
GCG™ package from the University of Wisconsin.
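
The window-based identity criterion above (for example, at least 70% identity over at least one window of 48 nucleotides, or at least 35% over 18 amino acid residues) can be illustrated with a short sliding-window calculation. The following Python sketch assumes that the two sequences have already been aligned, e.g., by a BLAST program, and are given as equal-length strings with "-" marking gaps; it is not a substitute for such a program. The function names, and the choice to count gapped columns as mismatches, are assumptions of this example.

```python
def best_window_identity(aligned_a: str, aligned_b: str, window: int) -> float:
    """Highest percent identity found in any window of the given length.

    Columns containing a gap ("-") are counted as mismatches (an assumption
    made for this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if window <= 0 or len(aligned_a) < window:
        raise ValueError("alignment is shorter than the window")
    matches = [a == b and a != "-"
               for a, b in zip(aligned_a.upper(), aligned_b.upper())]
    current = sum(matches[:window])
    best = current
    for i in range(window, len(matches)):
        # Slide the window one column to the right.
        current += matches[i] - matches[i - window]
        best = max(best, current)
    return 100.0 * best / window


def is_homolog(aligned_a: str, aligned_b: str,
               window: int = 48, min_pct: float = 70.0) -> bool:
    """Apply the nucleotide criterion by default (48-nucleotide window, 70%)."""
    return best_window_identity(aligned_a, aligned_b, window) >= min_pct
```

For polypeptide sequences, the corresponding call would be `is_homolog(a, b, window=18, min_pct=35.0)`.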

Homologs may also or in addition be characterized by the ability of two
complementary nucleic acid strands to hybridize to each other under appropriately
stringent conditions. Hybridizations are typically and preferably conducted with
probe-length nucleic acid molecules, preferably 20-100 nucleotides in length. Those
skilled in the art understand how to estimate and adjust the stringency of hybridization
conditions such that sequences having at least a desired level of complementarity will
stably hybridize, while those having lower complementarity will not. For examples of
hybridization conditions and parameters, see, e.g., Maniatis, T. et al. (1989)
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor University Press, Cold
Spring, N.Y.; Ausubel, F.M. et al. (1994) Current Protocols in Molecular Biology,
John Wiley & Sons, Secaucus, N.J. Homologs and homologous gene sequences may
thus be identified using any nucleic acid sequence of interest, including the phage
ORFs and bacterial target genes of the present invention.

A typical hybridization, for example, utilizes, besides the labeled probe of
interest, a salt solution such as 6X SSC (NaCl and sodium citrate base) to stabilize
nucleic acid strand interaction, a mild detergent such as 0.5% SDS, together with
other typical additives such as Denhardt's solution and salmon sperm DNA. The
solution is added to the immobilized sequence to be probed and incubated at suitable
temperatures to preferably permit specific binding while minimizing nonspecific
binding. The temperature of the incubations and ensuing washes is critical to the
success and clarity of the hybridization. Stringent conditions employ relatively higher
temperatures, lower salt concentrations, and/or more detergent than do non-stringent
conditions. Hybridization temperatures also depend on the length, complementarity
level, and nature (i.e., "GC content") of the sequences to be tested. Typical stringent
hybridizations and washes are conducted at temperatures of at least 40°C, while lower
stringency hybridizations and washes are typically conducted at 37°C down to room
temperature (~25°C). One of skill in the art is aware that these conditions may vary
according to the parameters indicated above, and that certain additives such as
formamide and dextran sulphate may also be added to affect the conditions.

By "stringent hybridization conditions" is meant hybridization conditions at
least as stringent as the following: hybridization in 50% formamide, 5X SSC, 50 mM
NaH2PO4, pH 6.8, 0.5% SDS, 0.1 mg/mL sonicated salmon sperm DNA, and 5X
Denhardt's solution at 42°C overnight; washing with 2X SSC, 0.1% SDS at 45°C; and
washing with 0.2X SSC, 0.1% SDS at 45°C.

In sequence comparison analyses, an ORF, or motif, or set of motifs in a
bacteriophage sequence can be compared to known inhibitor sequences, e.g.,
homologous sequences encoding homologous inhibitors of bacterial function.
Likewise, the analysis can include comparison with the structure of essential bacterial
gene products, as structural similarities can be indicative of similar or replacement
biological function. Such analysis can include the identification of a signature, or
characteristic motif(s) of an inhibitor or inhibitor class.

Also, the identification of structural motifs in an encoded product, based on
nucleotide or amino acid sequence analysis, can be used to infer a biochemical
function for the product. A database containing identified structural motifs in a large
number of sequences is available for identification of motifs in phage sequences. The
database is PROSITE, which is available at www.expasy.ch/cgi-bin/scanprosite. The
identification of motifs can, for example, include the identification of signature motifs
for a class or classes of inhibitory proteins. Other such databases may also be used.
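
As a hedged illustration of motif identification, the following Python sketch converts a pattern written in PROSITE pattern syntax into a regular expression and scans an amino acid sequence with it. Only the basic syntax elements are handled (x for any residue, square brackets for allowed residues, braces for excluded residues, parentheses for repeat counts, and the angle-bracket terminus anchors); anchors placed inside square brackets are not supported. The pattern used in the usage note is a generic syntax example rather than a signature of any particular inhibitor class, and the function names are hypothetical.

```python
import re

# Character-by-character translation from PROSITE pattern syntax to regex syntax.
_PROSITE_TO_REGEX = {
    "-": "",    # element separator
    "x": ".",   # any amino acid
    "{": "[^",  # start of a set of excluded residues
    "}": "]",
    "(": "{",   # repeat count, e.g. x(2,4) becomes .{2,4}
    ")": "}",
    "<": "^",   # pattern anchored at the N-terminus
    ">": "$",   # pattern anchored at the C-terminus
}


def prosite_to_regex(pattern: str) -> str:
    """Translate a basic PROSITE pattern into a Python regular expression."""
    pattern = pattern.strip().rstrip(".")  # patterns conventionally end with "."
    return "".join(_PROSITE_TO_REGEX.get(ch, ch) for ch in pattern)


def scan_motif(sequence: str, pattern: str):
    """Return (start, matched_text) for each non-overlapping motif match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence.upper())]
```

For instance, `scan_motif("MAGVKKKKQE", "[AC]-x-V-x(4)-{ED}")` reports a match beginning at position 1 of the sequence (0-based). Overlapping matches are not reported by this sketch.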

In aspects and preferred embodiments described herein, in which a bacterium
or host bacterium is specified, the bacterium or host bacterium is preferably selected
from a pathogenic bacterial species, for example, one selected from Table 1.
Preferably, an animal or plant pathogen is used. For animals, preferably the bacterium
is a bird or mammalian pathogen, still more preferably a human pathogen.

In aspects and preferred embodiments involving a bacteriophage or sequences
from a bacteriophage, one or more bacteriophage are preferably selected from those
listed in Table 1. Those exemplary bacteriophage are readily obtained from the
indicated sources.

In some cases, it is advantageous to utilize phage with non-pathogenic host
bacteria. The genome, structural motif, ORF, homolog, and other analyses described
herein can be performed on such phage and bacteria. Such analysis provides useful
information and compositions. The results of such analyses can also be utilized in
aspects of the present invention to identify homologous ORFs, especially inhibitor
ORFs in phage with pathogenic bacterial hosts. Similarly, identification of a target in
a non-pathogenic host can be used to identify homologous sequences and targets in
pathogenic bacteria, especially in genetically closely related bacteria. Those skilled in
the art are familiar with bacterial genetic relationships and with how to determine
relatedness based on levels of genomic identity or other measures of nucleotide
sequence and/or amino acid sequence similarity, and/or other physical and culture
characteristics such as morphology, nutritional requirements, or minimal media to
support growth.

Also in preferred embodiments, an embodiment of this aspect is combined
with an embodiment of the following aspect.

A related aspect of the invention provides methods for identifying a target for 
antibacterial agents by identifying the bacterial target(s) of at least one 
uncharacterized or untargeted inhibitor protein or RNA from a bacteriophage. Such
identification allows the development of antibacterial agents active on such targets. 
Preferred embodiments for identifying such targets involve the identification of 
binding of target and phage ORF products to one another. The phage ORF products 
may be subportions of a larger ORF product that also binds the host target. In
preferred embodiments, the phage protein or RNA is from an uncharacterized
bacteriophage in Table 1. This aspect preferably includes the identification of a
plurality of such targets in one or a plurality of different bacteria, preferably in one or
a plurality of bacteria listed in Table 1.

In preferred embodiments of this aspect and other aspects of this invention
involving particular phage ORFs or phage sequences, the ORF is Staphylococcus
aureus phage 77 ORF 17, 19, 43, 102, 104, or 182 as identified in U.S. application 
09/407,804, S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae 
phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, or 041, or 
Enterococcus sp. phage 182 ORF 002, 008, or 014. 

As indicated for the above aspect, preferably the method involves the use of a
plurality of different phage, and thus a plurality of different phage inhibitors and/or 
inhibitor ORFs. 

In addition to uncharacterized phage ORF products, it is also useful to identify
the targets of phage ORF products which are known to be inhibitors of host bacteria,
but where the target has not been identified. Thus, such inhibitors can likewise be
utilized as "untargeted" inhibitor phage ORFs and ORF products, e.g., proteins or 
RNAs. 

In the context of inhibitor proteins or RNAs from a phage, the term
"uncharacterized" means that a bacteria-inhibiting function for the protein has not
previously been identified. Preferably, but not necessarily, the sequence of the protein
or the corresponding coding region or ORF was not described in the art before the 
filing of the present application for patent (or alternatively prior to the present 
invention). Thus, this term specifically excludes any bacteria-inhibiting phage protein 
and its associated bacterial target which has been identified as inhibitory before the
present invention or alternatively before the filing of the present application, for
example those identified in Tables 12-14 or otherwise identified herein. For example,
from E. coli, phage T7 genes 0.7 and 2.0 target the host RNA polymerase, and phage T4
gp55/gp33 alter the specificity of host RNA polymerase. The T4 regB gene product
also targets the host translation apparatus. As with the uncharacterized bacteriophage 
ORFs or bacteriophage above, for such identified proteins, the sequences encoding 
those proteins are excluded from the uncharacterized inhibitor proteins. 

The term "fragment" refers to a portion of a larger molecule or assembly. For
proteins, the term "fragment" refers to a molecule which includes at least 5 contiguous 
amino acids from the reference polypeptide or protein, preferably at least 8, 10, 12, 
15, 20, 30, 50 or more contiguous amino acids. In connection with oligo- or 
polynucleotides, the term "fragment" refers to a molecule which includes at least 15
contiguous nucleotides from a reference polynucleotide, preferably at least 24, 30, 36,
45, 60, 90, 150, or more contiguous nucleotides. 
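
The length-based definition of "fragment" above can be illustrated with a minimal sketch. The function name, the constants, and the example sequences below are hypothetical and are not taken from the phage sequences described herein; the sketch merely assumes that a candidate qualifies when it shares a run of at least the stated number of contiguous residues with the reference.

```python
# Minimum contiguous lengths stated in the fragment definition above.
MIN_PROTEIN_FRAGMENT = 5       # contiguous amino acids
MIN_NUCLEOTIDE_FRAGMENT = 15   # contiguous nucleotides


def is_fragment(candidate: str, reference: str, min_len: int) -> bool:
    """Return True if candidate contains at least min_len contiguous
    residues that also occur contiguously in reference."""
    candidate, reference = candidate.upper(), reference.upper()
    if min_len <= 0 or len(candidate) < min_len:
        return False
    # Every window of length min_len present in the reference.
    windows = {reference[i:i + min_len]
               for i in range(len(reference) - min_len + 1)}
    # A longer shared run always contains a shared window of length min_len.
    return any(candidate[i:i + min_len] in windows
               for i in range(len(candidate) - min_len + 1))


# Hypothetical reference polypeptide, for illustration only.
reference_protein = "MKTAYIAKQR"
shares_five = is_fragment("GGTAYIAGG", reference_protein, MIN_PROTEIN_FRAGMENT)
shares_four = is_fragment("GGTAYIGG", reference_protein, MIN_PROTEIN_FRAGMENT)
```

The same function can be applied to polynucleotides by passing MIN_NUCLEOTIDE_FRAGMENT, or any of the preferred longer lengths, as min_len.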

Preferred embodiments involve identification of binding that includes methods
for distinguishing bound molecules, for example, affinity chromatography,
immunoprecipitation, crosslinking, and/or genetic screen methods that permit
protein:protein interactions to be monitored. One of skill in the art is familiar with
these techniques and common materials utilized (see, e.g., Coligan, J. et al. (eds.) 
(1995) Current Protocols in Protein Science. John Wiley & Sons, Secaucus, N.J.). 

Genetic screening for the identification of protein:protein interactions typically
involves the co-introduction of both a chimeric bait nucleic acid sequence (here, the
phage ORF to be tested) and a chimeric target nucleic acid sequence that, when
co-expressed and having affinity for one another in a host cell, stimulate reporter gene
expression to indicate the relationship. A "positive" can thus suggest a potential 
inhibitory effect in bacteria. This is discussed in further detail in the Detailed 
Description section below. In this way, new bacterial targets can be identified that are
inhibited by specific phage ORF products or derivatives, fragments, mimetics, or
other molecules. 

Other embodiments involve the identification and/or utilization of mutant 
targets by virtue of their host's relatively unresponsive nature in the presence of 
expression of ORFs previously identified as inhibitory to the non-mutant or wild-type
strain. Such mutants have the effect of protecting the host from an inhibition that
would otherwise occur and indirectly allow identification of the precise responsible 
target for follow-up studies and anti-microbial development. In certain embodiments, 
rescue from inhibition occurs under conditions in which a bacterial target or mutant 
target is highly expressed. This is performed, for example, through coupling of the
sequence with regulatory element promoters, e.g., as known in the art, which regulate
expression at levels higher than wild-type, e.g., at a level sufficiently higher that the
inhibitor can be competitively bound to the highly expressed target such that the
bacterium is detectably less inhibited. 

Identification of the bacterial target can involve identification of a
phage-specific site of action. This can involve a newly identified target, or a target where the
phage site of action differs from the site of action of a previously known antibacterial
agent or inhibitor. For example, phage T7 genes 0.7 and 2.0 target the host RNA
polymerase, which is also the cellular target for the antibacterial agent rifampin. To
the extent that a phage product is found to act at a different site than previously
described inhibitors, aspects of the present invention can utilize those new,
phage-specific sites for identification and use of new agents. The site of action can be
identified by techniques well-known to those skilled in the art, for example, by 
mutational analysis, binding competition analysis, and/or other appropriate 
techniques. 

Once a bacterial host target protein or nucleic acid or mutant target sequence
has been identified and/or isolated, it too can be conveniently sequenced, sequence
analyzed (e.g., by computer), and the underlying gene(s), and corresponding translated 
product(s) further characterized. Preferred embodiments include such analysis and 
identification. Preferably such a target has not previously been identified as an 
appropriate target for antibacterial action. 

Certain embodiments include the identification of at least one inhibitory phage
ORF or ORF product, e.g., as described for the above aspect, and thus are a 
combination of the two aspects. 

Additionally, the invention provides methods for identifying targets for
antibacterial agents by identifying homologs of a bacterial target (e.g., from S. aureus,
Enterococcus faecalis or other Enterococci, and Streptococcus pneumoniae) of a
bacteriophage inhibitory ORF product. Such homologs may be utilized in the various
aspects and embodiments described herein as described for the host Enterococcus sp.
for bacteriophage 182.

Other aspects of the invention provide isolated, purified, or enriched specific
phage nucleic acid and amino acid sequences, subsequences, and homologs thereof for
phage selected from uncharacterized phage listed in Table 1, preferably from
bacteriophage 77, 3A, 96, 44AHJD (Staphylococcus aureus host bacterium), Dp-1
(Streptococcus pneumoniae host), or 182 (Enterococcus host) or other phage listed in
Table 1 for those bacteria. For example, such sequences do not include sequences
identified in any of Tables 11-14. Nucleotide sequences of this aspect are at least 15
nucleotides in length, preferably at least 18, 21, 24, or 27 nucleotides in length, more
preferably at least 30, 50, or 90 nucleotides in length. In certain embodiments, longer
nucleic acids are preferred, for example those of at least 120, 150, 200, 300, 600, 900
or more nucleotides. Such sequences can, for example, be amplification 
oligonucleotides (e.g., PCR primers), oligonucleotide probes, sequences encoding a 
portion or all of a phage-encoded protein, or a fragment or all of a phage-encoded 
protein. In preferred embodiments, the nucleic acid sequence contains a sequence
which is within a length range with a lower length as specified above, and an upper 
length limit which is no more than 50, 60, 70, 80, or 90% of the length of the 
corresponding full-length ORF. The upper length limit can also be expressed in terms 
of the number of base pairs of the ORF (coding region). In preferred embodiments, 
the nucleic acid sequence is from Staphylococcus aureus phage 77 ORF 17, 19, 43,
102, 104, or 182 as identified in U.S. application 09/407,804, S. aureus phage
44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004,
008, 010, 013, 016, 021, 029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 
002, 008, or 014. 

As it is recognized that alternate codons will encode the same amino acid for
most amino acids due to the degeneracy of the genetic code, the sequences of this
aspect include nucleic acid sequences utilizing such alternate codon usage for one or
more codons of a coding sequence. For example, all four nucleic acid sequences
GCT, GCC, GCA, and GCG encode the amino acid alanine. Therefore, if for an
amino acid there exists an average of three codons, a polypeptide of 100 amino acids
in length will, on average, be encoded by 3^100, or 5 x 10^47, nucleic acid sequences.
Thus, a nucleic acid sequence can be modified (e.g., a nucleic acid sequence from a 
phage as specified above) to form a second nucleic acid sequence encoding the same 
polypeptide as encoded by the first nucleic acid sequence using routine procedures
and without undue experimentation. Thus, all possible nucleic acid sequences that
encode the specified amino acid sequences are also fully described herein, as if all 
were written out in full, taking into account the codon usage, especially that preferred 
in the host bacterium. The alternate codon descriptions are available in common 
textbooks, for example, Stryer, BIOCHEMISTRY 3rd ed., and Lehninger,
BIOCHEMISTRY 3 nJ ed., along with many others. Codon preference tables for
various types of organisms are available in the literature. Sequences with alternate 
codons at one or more sites can also be utilized in the computer-related aspects and 
embodiments herein. Because of the number of sequence variations involving 
alternate codon usage, for the sake of brevity, individual sequences are not separately
listed herein. Instead the alternate sequences are described by reference to the natural
sequence with replacement of one or more (up to all, e.g., up to 3, 5, 10, 15, 20, 30, 40,
50, or more) of the degenerate codons with alternate codons from the alternate codon
table (Table 6), or a modified table applicable to a particular organism that has
differing codon usage, preferably with selection according to preferred codon usage 
for the normal host organism or a host organism in which a sequence is intended to be 
expressed. Those skilled in the art also understand how to alter the alternate codons to 
be used for expression in organisms where certain codons code differently than shown
in the "universal" codon table. 
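
The degeneracy estimate given above (an average of three codons per amino acid, hence about 3^100 encodings for a 100-residue polypeptide) can be checked with a short sketch. The codon counts below are those of the standard ("universal") genetic code; the function name and the example peptide are hypothetical and serve only as illustration.

```python
from math import prod

# Number of sense codons per amino acid in the standard genetic code
# (one-letter amino acid codes; the 20 entries sum to 61 sense codons).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(peptide: str) -> int:
    """Exact number of distinct coding sequences for a given peptide
    under the standard code (stop codon not counted)."""
    return prod(CODON_COUNTS[aa] for aa in peptide.upper())


# Rough estimate from the text: three codons per residue, 100 residues.
rough_estimate = 3 ** 100                  # about 5 x 10^47

# Exact count for a short hypothetical peptide (Ala-Ala-Leu).
exact_for_aal = count_encodings("AAL")     # 4 * 4 * 6
```

For a specific polypeptide the exact count depends on its composition; peptides rich in methionine or tryptophan have far fewer alternate encodings than the three-codon average suggests.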

For amino acid sequences or polypeptides, sequences contain at least 5
peptide-linked amino acid residues, and preferably at least 6, 7, 10, 15, 20, 30, or 40 amino
acids having identical amino acid sequence as the same number of contiguous amino
acid residues in a particular phage ORF product. In some cases longer sequences may
be preferred, for example, those of at least 50, 60, 70, 80, or 100 amino acids in 
length. In preferred embodiments, the amino acid sequence contains a sequence which 
is within a length range with a lower length as specified above, and an upper length 
limit which is no more than 50, 60, 70, 80, or 90% of the length of the corresponding
full-length ORF product. The upper length limit can also be expressed in terms of the
number of amino acid residues of the ORF product. In preferred embodiments, the 
amino acid sequence or polypeptide has bacteria-inhibiting function when expressed 
or otherwise present in a bacterial cell which is a host for the bacteriophage from 
which the sequence was derived. 

By "isolated" in reference to a nucleic acid is meant that a naturally occurring
sequence has been removed from its normal cellular (e.g., chromosomal) environment
or is synthesized in a non-natural environment (e.g., artificially synthesized). Thus,
the sequence may be in a cell-free solution or placed in a different cellular
environment. The term does not imply that the sequence is the only nucleotide chain
present, but that it is essentially free (about 90-95% pure at least) of non-nucleotide
material naturally associated with it, and thus is distinguished from isolated 
chromosomes. 

The term "enriched" means that the specific DNA or RNA sequence 
constitutes a significantly higher fraction (2-5 fold) of the total DNA or RNA present
in the cells or solution of interest than in normal or diseased cells or in cells from
which the sequence was originally taken. This could be caused by a person by
preferential reduction in the amount of other DNA or RNA present, or by a
preferential increase in the amount of the specific DNA or RNA sequence, or by a
combination of the two. However, it should be noted that enriched does not imply
that there are no other DNA or RNA sequences present, just that the relative amount
of the sequence of interest has been significantly increased. 

The term "significant" is used to indicate that the level of increase is useful to
the person making such an increase and is an increase relative to other nucleic acids of
at least about 2-fold, more preferably at least 5- to 10-fold or even more. The term
also does not imply that there is no DNA or RNA from other sources. The other
source DNA may, for example, comprise DNA from a yeast or bacterial genome, or a
cloning vector such as pUC19. This term distinguishes from naturally occurring
events, such as viral infection, or tumor type growths, in which the level of one
mRNA may be naturally increased relative to other species of mRNA. That is, the
term is meant to cover only those situations in which a person has intervened to
elevate the proportion of the desired nucleic acid.
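
As an illustration of the fold increase referred to in the definitions of "enriched" and "significant", the following sketch computes the change in the fraction of a sequence of interest relative to total nucleic acid. The function names and the example quantities are hypothetical; the 2-fold default threshold is taken from the text above.

```python
def fold_enrichment(target_after: float, total_after: float,
                    target_before: float, total_before: float) -> float:
    """Ratio of the target's fraction of total nucleic acid after
    intervention to its fraction before intervention."""
    if min(target_before, total_before, total_after) <= 0:
        raise ValueError("amounts must be positive")
    # (target_after / total_after) / (target_before / total_before),
    # rearranged to a single division.
    return (target_after * total_before) / (total_after * target_before)


def is_significant(fold: float, minimum_fold: float = 2.0) -> bool:
    """True if the fold increase meets the stated minimum (2-fold here)."""
    return fold >= minimum_fold


# Hypothetical example: 1 ng of target in 1000 ng total before, and the
# same 1 ng of target in 200 ng total after preferential removal of
# other nucleic acid.
fold = fold_enrichment(1, 200, 1, 1000)
significant = is_significant(fold)
```

The same ratio results whether the enrichment arises from a reduction in other DNA or RNA, an increase in the specific sequence, or a combination of the two.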

It is also advantageous for some purposes that a nucleotide sequence be in 
purified form. The term "purified" in reference to nucleic acid does not require 
absolute purity (such as a homogeneous preparation). Instead, it represents an 
indication that the sequence is relatively more pure than in the natural environment
(compared to the natural level, this level should be at least 2-5 fold greater, e.g., in
terms of mg/mL). Individual clones isolated from a cDNA library may be purified to
electrophoretic homogeneity. The claimed DNA molecules obtained from these
clones could be obtained directly from total DNA or from total RNA. The cDNA
clones are not naturally occurring, but rather are preferably obtained via manipulation
of a partially purified naturally occurring substance (messenger RNA). The
construction of a cDNA library from mRNA involves the creation of a synthetic 
substance (cDNA) and pure individual cDNA clones can be isolated from the 
synthetic library by clonal selection of the cells carrying the cDNA library. Thus, the 
process which includes the construction of a cDNA library from mRNA and isolation
of distinct cDNA clones yields an approximately 10^6-fold purification of the native
message. Thus, purification of at least one order of magnitude, preferably two or 
three orders, and more preferably four or five orders of magnitude is expressly 
contemplated. 

The terms "isolated", "enriched", and "purified" with respect to nucleic acids,
above, may similarly be used to denote the relative purity and abundance of
polypeptides (multimers of amino acids joined one to another by α-carboxyl:α-amino
group (peptide) bonds). These, too, may be stored in, grown in, screened in, and
selected from libraries using biochemical techniques familiar in the art. Such
polypeptides may be natural, synthetic or chimeric and may be extracted using any of
a variety of methods, such as antibody immunoprecipitation, other "tagging"
techniques, conventional chromatography and/or electrophoretic methods. Some of
the above utilize the corresponding nucleic acid sequence. 

As indicated above, aspects and embodiments of the invention are not limited
to entire genes and proteins. The invention also provides and utilizes fragments and
portions thereof, preferably those which are "active" in the inhibitory sense described
above. Such peptides or oligopeptides and oligo- or polynucleotides have preferred
lengths as specified above for nucleic acid and amino acid sequences from phage;
corresponding recombinant constructs can be made to express the encoded same.
Also included are homologous sequences and fragments thereof.

Nucleic acid sequences of the present invention can be isolated using a method
similar to those described herein or other methods known to those skilled in the art.
In addition, such nucleic acid sequences can be chemically synthesized by
well-known methods. Also, by having particular phage ORFs, e.g., the phage ORFs
identified herein (e.g., anti-bacterial ORFs of the present invention, portions thereof,
or oligonucleotides derived therefrom as described), other antimicrobial sequences
from other bacteriophage sources can be identified and isolated using methods
described here or other methods, including methods utilizing nucleic acid

The invention also provides bacteriophage antimicrobial DNA segments from 
other phages based on nucleic acids and sequences hybridizing to the presently 
identified inhibitory ORF under high stringency conditions or sequences that are
highly homologous. The bacteriophage segment from a specific phage, e.g., an
antimicrobial DNA segment, can be used to identify a related segment from another 
unrelated phage based on stringent conditions of hybridization or on being a homolog 
based on nucleic acid and/or amino acid sequence comparisons. As with identified 
inhibitory sequences, such homologous coding sequences and products can be used as
antimicrobials, to construct active portions or derivatives, to construct
peptidomimetics, and to identify bacterial targets. 

The nucleotide and amino acid sequences identified herein are believed to be
correct; however, certain sequences may contain a small percentage of errors, e.g.,
1-5%. In the event that any of the sequences have errors, the corrected sequences can be
readily provided by one skilled in the art using routine methods. For example, the
nucleotide sequences can be confirmed or corrected by obtaining and culturing the
relevant phage, and purifying phage genomic nucleic acids. A region or regions of
interest can be amplified, e.g., by PCR from the appropriate genomic template, using
primers based on the described sequence. The amplified regions can then be
sequenced using any of the available methods (e.g., a dideoxy termination method).



WO 00/32825 



PCT/IB99/02040 



This can be done redundantly to provide the corrected sequence or to confirm that the 
described sequence is correct. Alternatively, a particular sequence or sequences can 
be identified and isolated as an insert or inserts in a phage genomic library and 
isolated, amplified, and sequenced by standard methods. Confirmation or correction
of a nucleotide sequence for a phage gene provides an amino acid sequence of the
encoded product by merely reading off the amino acid sequence according to the
normal codon relationships and/or by expressing the gene in a standard expression
system and sequencing the polypeptide product by standard techniques. The sequences
described herein thus provide unique identification of the corresponding genes, coding
sequences, and other sequences, allowing those sequences to be used in the various
aspects of the present invention. 
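
Reading off an amino acid sequence from a confirmed nucleotide sequence, as described above, can be sketched as follows. The sketch assumes the standard genetic code; the function name and the example ORF are hypothetical and are not sequences of the invention. Organisms in which certain codons code differently would require a modified table.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering;
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(orf: str) -> str:
    """Translate a coding sequence (DNA or RNA) codon by codon,
    stopping at the first stop codon."""
    seq = orf.upper().replace("U", "T")
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical ORF using all four alanine codons (GCT, GCC, GCA, GCG).
peptide = translate("ATGGCTGCCGCAGCGTAA")
```

Because the four alanine codons translate identically, the example also illustrates why many distinct nucleotide sequences encode the same polypeptide.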

In other aspects, the invention provides recombinant vectors and cells 
harboring at least one of the phage ORFs or portion thereof, or bacterial target 
sequences described herein. As understood by those skilled in the art, vectors may be
provided in different forms, including, for example, plasmids, cosmids, and
virus-based vectors. See, e.g., Maniatis, T. et al. (1989) Molecular Cloning: A Laboratory
Manual. Cold Spring Harbor University Press, Cold Spring, N.Y.; see also, Ausubel,
F.M. et al. (eds.) (1994) Current Protocols in Molecular Biology. John Wiley & Sons,
Secaucus, N.J.

In preferred embodiments, the vectors will be expression vectors, preferably
shuttle vectors that permit cloning, replication, and expression within bacteria. An
"expression vector" is one having regulatory nucleotide sequences containing
transcriptional and translational regulatory information that controls expression of the
nucleotide sequence in a host cell. Preferably the vector is constructed to allow
amplification from vector sequences flanking an insert locus. In certain embodiments,
the expression vectors may additionally or alternatively support expression and/or
replication in animal, plant and/or yeast cells due to the presence of suitable
regulatory sequences, e.g., promoters, enhancers, 3' stabilizing sequences, primer
sequences, etc. In preferred embodiments, the promoters are inducible and specific
for the system in which expression is desired, e.g., bacteria, animal, plant, or yeast.
The vectors may optionally encode a "tag" sequence or sequences to facilitate protein
purification. Convenient restriction enzyme cloning sites and suitable selective
marker(s) are also optionally included. Such selective markers can be, for example,
antibiotic resistance markers or markers which supply an essential nutritive growth
factor to an otherwise deficient mutant host, e.g., tryptophan, histidine, or leucine in
the Yeast Two-Hybrid systems described below.

The term "recombinant vector" relates to a single- or double-stranded circular
nucleic acid molecule that can be transfected into cells and replicated within or
independently of a cell genome. A circular double-stranded nucleic acid molecule can
be cut and thereby linearized upon treatment with appropriate restriction enzymes. An
assortment of nucleic acid vectors, restriction enzymes, and the knowledge of the
nucleotide sequences cut by restriction enzymes are readily available to those skilled
in the art. A nucleic acid molecule encoding a desired product can be inserted into a
vector by cutting the vector with restriction enzymes and ligating the two pieces
together. Preferably the vector is an expression vector, e.g., a shuttle expression
vector as described above.

By "recombinant cell" is meant a cell possessing introduced or engineered
nucleic acid sequences, e.g., as described above. The sequence may be in the form of 
or part of a vector or may be integrated into the host cell genome. Preferably the cell 
is a bacterial cell. 

In another aspect, the invention also provides methods for identifying and/or
screening compounds "active on" at least one bacterial target of a bacteriophage
inhibitor protein or RNA. Preferred embodiments involve contacting such a bacterial
target or targets (e.g., bacterial target proteins) with a test compound, and determining
whether the compound binds to or reduces the level of activity of the bacterial target
(e.g., a bacterial target protein). Preferably this is done either in vivo (i.e., in a
cell-based assay) or in vitro, e.g., in a cell-free system under approximately physiological
conditions. 

The compounds that can be used may be large or small, synthetic or natural, 
organic or inorganic, proteinaceous or non-proteinaceous. In preferred embodiments, 
the compound is a peptidomimetic, as described herein, a bacteriophage inhibitor
protein or fragment or derivative thereof, preferably an "active portion", or a small 
molecule. 

In preferred embodiments, the bacterial target is a target of a phage ORF 
identified herein, e.g., S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus
pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038,
or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014. 

In particular embodiments, the methods include the identification of bacterial 
targets or the site of action of an inhibitor on a bacterial target as described above or 
otherwise described herein. 

In embodiments involving binding assays, preferably binding is to a fragment
or portion of a bacterial target protein, where the fragment includes less than 90%,
80%, 70%, 60%, 50%, 40%, or 30% of an intact bacterial target protein. Preferably,
the at least one bacterial target includes a plurality of different targets of
bacteriophage inhibitor proteins, preferably a plurality of different targets. The
plurality of targets can be in or from a plurality of different bacteria, but preferably is 
from a single bacterial species. 

A "method of screening" refers to a method for evaluating a relevant activity
or property of a large plurality of compounds (e.g., a bacteria-inhibiting activity), 
rather than just one or a few compounds. For example, a method of screening can be 
used to conveniently test at least 100, more preferably at least 1000, still more 
preferably at least 10,000, and most preferably at least 100,000 different compounds,
or even more.

In the context of this invention, the term "small molecule" refers to
compounds having molecular mass of less than 2000 Daltons, preferably less than 
1500, still more preferably less than 1000, and most preferably less than 600 Daltons. 
Preferably but not necessarily, a small molecule is not an oligopeptide. 
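
The molecular mass thresholds in the definition of "small molecule" can be expressed as a simple classification. The function name and tier labels below are hypothetical and chosen only for illustration; the thresholds (2000, 1500, 1000, and 600 Daltons) are those stated above.

```python
def small_molecule_tier(mass_daltons: float) -> str:
    """Classify a compound by molecular mass using the thresholds in the
    definition of "small molecule" (all comparisons are strict)."""
    if mass_daltons < 600:
        return "most preferred"
    if mass_daltons < 1000:
        return "still more preferred"
    if mass_daltons < 1500:
        return "preferred"
    if mass_daltons < 2000:
        return "small molecule"
    return "not a small molecule"


# Rifampin, mentioned above as an agent acting on RNA polymerase, has a
# molecular mass of roughly 823 Daltons.
rifampin_tier = small_molecule_tier(823)
```

Such a filter could be applied to a compound library before screening, although the definition above does not require that any particular tier be used.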

In a related aspect or in preferred embodiments, the invention provides a
method of screening for potential antibacterial agents by determining whether any of a 
plurality of compounds, preferably a plurality of small molecules, is active on at least 
one target of a bacteriophage inhibitor protein or RNA. Preferred embodiments 
include those described for the above aspect, including embodiments which involve
determining whether one or more test compounds bind to or reduce the level of
activity of a bacterial target, and embodiments which utilize a plurality of different 
targets as described above. 

The identification of bacteria-inhibiting phage ORFs and their encoded 
products also provides a method for identifying an active portion of such an encoded
product. This also provides a method for identifying a potential antibacterial agent by
identifying such an active portion of a phage ORF or ORF product. In preferred 
embodiments, the identification of an active portion involves one or more of 
mutational analysis, deletion analysis, or analysis of fragments of such products. The 
method can also include determination of a 3-dimensional structure of an active
portion, such as by analysis of crystal diffraction patterns. In further embodiments,
the method involves constructing or synthesizing a peptidomimetic compound, where 
the structure of the peptidomimetic compound corresponds to the structure of the 
active portion. In this context, "corresponds" means that the peptidomimetic 
compound structure has sufficient similarities to the structure of the active portion that
the peptidomimetic will interact with the same molecule as the phage protein and
preferably will elicit at least one cellular response in common which relates to the 
inhibition of the cell by the phage protein. 



In preferred embodiments, the ORF or ORF product is or is derived or 
obtained from S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae 
phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, or 041, or 
Enterococcus sp. phage 182 ORF 002, 008, or 014 or product thereof. 

The methods for identifying or screening for compounds or agents active on a
bacterial target of a phage-encoded inhibitor can also involve identification of a
phage-specific site of action on the target.

Preferably in the methods for identifying or screening for compounds active
on such a bacterial target, the target is uncharacterized; the target is from an
uncharacterized bacterium from Table 1; the site of action is a phage-specific site of
action.

Further embodiments include the identification of inhibitor phage ORFs and 
bacterial targets as in aspects above. 

An "active portion" as used herein denotes an epitope, a catalytic or regulatory
domain, or a fragment of a bacteriophage inhibitor protein that is responsible for, or a
significant factor in, bacterial target inhibition. The active portion preferably may be 
removed from its contiguous sequences and, in isolation, still effect inhibition. 

By "mimetic" is meant a compound structurally and functionally related to a 
reference compound that can be natural, synthetic, or chimeric. In terms of the present
invention, a "peptidomimetic," for example, is a compound that mimics the
activity-related aspects of the 3-dimensional structure of a peptide or polypeptide in a
non-peptide compound, for example mimics the structure of a peptide or active portion of
a phage- or bacterial ORF-encoded polypeptide.

A related aspect provides a method for inhibiting a bacterial cell by contacting
the bacterial cell with a compound active on a bacterial target of a bacteriophage
inhibitor protein or RNA, where the target was uncharacterized. In preferred
embodiments, the compound is such a protein, or a fragment or derivative thereof; a
structural mimetic, e.g., a peptidomimetic, of such a protein or fragment; a small
molecule; the contacting is performed in vitro; the contacting is performed in vivo in
an infected or at risk organism, e.g., an animal such as a mammal or bird, for example,
a human, or other mammal described herein; the bacterium is selected from a genus
and/or species listed in Table 1; the bacteriophage inhibitor protein is uncharacterized;
the bacteriophage inhibitor protein is from an uncharacterized phage listed in Table 1;
the phage inhibitor protein is from one of S. aureus phage 44AHJD ORF 1, 9, or 12,
Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021,
029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014.
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In the context of targets in this invention, the term "uncharacterized" means 
that the target was not recognized as an appropriate target for an antibacterial agent 
prior to the filing of the present application or alternatively prior to the present 
invention. Such lack of recognition can include, for example, situations where the 
5 target and/or a nucleotide sequence encoding the target were unknown, situations 
where the target was known, but where it had not been identified as an appropriate 
target or as an essential cellular component, and situations where the target was 
known as essential but had not been recognized as an appropriate target due to a belief 
that the target would be inaccessible or otherwise that contacting the cell with a 

1 0 compound active on the target in vitro would be ineffective in cellular inhibition, or 
ineffective in treatment of an infection. Methods described herein utilizing bacterial 
targets, e.g., for inhibiting bacteria or treating bacterial infections, can also utilize 
"uncharacterized target sites", meaning that the target has been previously recognized 
as an appropriate target for an antibacterial agent, but where an agent or inhibitor of 

1 5 the invention is used which acts at a different site than that at which the previously 
utilized antibacterial agent, i.e., a phage-specific site. Preferably the phage-specific 
site has different functional characteristics from the previously utilized site. In the 
context of targets or target sites, the term "phage-specific" indicates that the target or 
site is utilized by at least one bacteriophage as an inhibitory target and is different 

20 from previously identified targets or target sites. 

In the context of this invention, the term "bacteriophage inhibitor protein"
refers to a protein encoded by a bacteriophage nucleic acid sequence which inhibits
bacterial function in a host bacterium. Thus, it is a bacteria-inhibiting phage product.

In the context of this invention, the phrase "contacting the bacterial cell with a
compound active on a bacterial target of a bacteriophage inhibitor protein" or an
equivalent phrase refers to contacting with an isolated, purified, or enriched
compound or a composition including such a compound, and specifically does not rely
on contacting the bacterial cell with an intact phage which encodes the compound.
Preferably no intact phage are involved in the contacting.

Related aspects provide methods for prophylactic or therapeutic treatment of a
bacterial infection by administering to an infected, challenged or at risk organism a
therapeutically or prophylactically effective amount of a compound active on a target
of a bacteriophage inhibitor protein or RNA, or as described for the previous aspect.
Preferably the bacterium involved in the infection or risk of infection produces the
identified target of the bacteriophage inhibitor protein or alternatively produces a
homologous target compound. In preferred embodiments, the host organism is a plant
or animal, preferably a mammal or bird, and more preferably, a human or other
mammal described herein. Preferred embodiments include, without limitation, those
as described for the preceding aspect.

Compounds useful for the methods of inhibiting, methods of treating, and
pharmaceutical compositions can include novel compounds, but can also include
compounds which had previously been identified for a purpose other than inhibition
of bacteria. Such compounds can be utilized as described and can be included in
pharmaceutical compositions.

In preferred embodiments of this and other aspects of the invention utilizing
bacterial target sequences of a bacteriophage inhibitory ORF product, the target
sequence is encoded by a Staphylococcus nucleic acid coding sequence, preferably S.
aureus, a Streptococcus nucleic acid coding sequence, preferably Streptococcus
pneumoniae, or an Enterococcus nucleic acid coding sequence. Possible target
sequences are described herein by reference to sequence source sites.

The amino acid sequence of a polypeptide target is readily provided by
translating the corresponding coding region. For the sake of brevity, the sequences
are not written out in full herein but are described by reference to the GenBank
entries. In cases where the TIGR or GenBank entry for a coding region is not complete,
the complete sequence can be readily obtained by routine methods, e.g., by isolating a
clone in a phage host genomic library, and sequencing the clone insert to provide the
relevant coding region. The boundaries of the coding region can be identified by
conventional sequence analysis and/or by expression in a bacterium in which the
endogenous copy of the coding region has been inactivated and using subcloning to
identify the functional start and stop codons for the coding region.
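
By way of illustration only, the translation of a coding region into an amino acid
sequence can be sketched as follows. The sketch assumes the standard ("universal")
genetic code and a coding region supplied in frame, beginning at its start codon; the
names CODON_TABLE and translate are illustrative and are not part of the disclosure.

```python
from itertools import product

# Standard ("universal") genetic code, laid out in TCAG order
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_region: str) -> str:
    """Translate an in-frame DNA (or RNA) coding region up to the first stop codon."""
    seq = coding_region.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the polypeptide
            break
        protein.append(amino_acid)
    return "".join(protein)
```

Because the code is redundant, two different nucleotide sequences (degenerate
equivalents) can yield the same polypeptide; for example, ATGGCTAAATAA and
ATGGCCAAGTAG both encode the same three residues under this table.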

In the context of nucleic acid or amino acid sequences of this invention, the
term "corresponding" indicates that the sequence is at least 95% identical, preferably
at least 97% identical, and more preferably at least 99% identical to a sequence from
the specified phage genome, a ribonucleotide equivalent, a degenerate equivalent
(utilizing one or more degenerate codons), or a homologous sequence, where the
homolog provides an equivalent biological function.
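
As a non-limiting sketch, the identity thresholds recited above can be checked on a
pair of sequences that have already been aligned. The sketch assumes equal-length
aligned strings with "-" marking gaps and counts identity over all alignment columns
other than gap-gap columns; that denominator convention, and the names
percent_identity and identity_tier, are assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding identical residues.

    Both inputs must be pre-aligned to the same length; '-' marks a gap.
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def identity_tier(pct: float):
    """Return the highest recited threshold (99, 97, or 95) met, or None."""
    for threshold in (99, 97, 95):
        if pct >= threshold:
            return threshold
    return None
```

Under this convention, a 100-column alignment with a single mismatch gives 99%
identity and therefore meets the most preferred threshold.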

By "treatment" or "treating" is meant administering a compound or
pharmaceutical composition for prophylactic and/or therapeutic purposes. The term
"prophylactic treatment" refers to treating a patient or animal that is not yet infected
but is susceptible to or otherwise at risk of a bacterial infection. The term "therapeutic
treatment" refers to administering treatment to a patient already suffering from an
infection.

The term "bacterial infection" refers to the invasion of the host organism,
animal or plant, by pathogenic bacteria. This includes the excessive growth of bacteria
which are normally present in or on the body of the organism, but more generally, a
bacterial infection can be any situation in which the presence of a bacterial
population(s) is damaging to a host organism. Thus, for example, an organism suffers
from a bacterial infection when excessive numbers of a bacterial population are
present in or on the organism's body, or when the effects of the presence of a bacterial
population(s) are damaging to the cells, tissue, or organs of the organism.

The terms "administer", "administering", and "administration" refer to a
method of giving a dosage of a compound or composition, e.g., an antibacterial
pharmaceutical composition, to an organism. Where the organism is a mammal, the
method is, e.g., topical, oral, intravenous, transdermal, intraperitoneal, intramuscular,
or intrathecal. The preferred method of administration can vary depending on various
factors, e.g., the components of the pharmaceutical composition, the site of the
potential or actual bacterial infection, the bacterium involved, and the infection
severity.

The term "mammal" has its usual biological meaning referring to any
organism of the Class Mammalia of higher vertebrates that nourish their young with
milk secreted by mammary glands, e.g., mouse, rat, and, in particular, human, bovine,
sheep, swine, dog, and cat.

In the context of treating a bacterial infection, a "therapeutically effective
amount" or "pharmaceutically effective amount" indicates an amount of an
antibacterial agent, e.g., as disclosed for this invention, which has a therapeutic effect.
This generally refers to the inhibition, to some extent, of the normal cellular
functioning of bacterial cells that causes or contributes to a bacterial infection.

The dose of antibacterial agent that is useful as a treatment is a
"therapeutically effective amount." Thus, as used herein, a therapeutically effective
amount means an amount of an antibacterial agent that produces the desired
therapeutic effect as judged by clinical trial results and/or animal models. This amount
can be routinely determined by one skilled in the art and will vary depending on
several factors, such as the particular bacterial strain involved and the particular
antibacterial agent used.

In connection with claims to methods of inhibiting bacteria and therapeutic or
prophylactic treatments, "a compound active on a target of a bacteriophage inhibitor
protein" or terms of equivalent meaning differ from administration of or contact with
an intact phage naturally encoding the full-length inhibitor compound. While an
intact phage may conceivably be incorporated in the present methods, the method at
least includes the use of an active compound as specified different from a full-length
inhibitor protein naturally encoded by a bacteriophage and/or a delivery or contacting
method different from administration of or contact with an intact phage encoding the
full-length protein. Similarly, pharmaceutical compositions described herein at least
include an active compound different from a full-length inhibitor protein naturally
encoded by a bacteriophage, or such a full-length protein is provided in the
composition in a form different from being encoded by an intact phage. Preferably
the methods and compositions do not include an intact phage.

In accord with the above aspects, the invention also provides antibacterial
agents and compounds active on bacterial targets of bacteriophage inhibitor proteins
or RNAs, where the target was uncharacterized as indicated above. As previously
indicated, such active compounds include both novel compounds and compounds
which had previously been identified for a purpose other than inhibition of bacteria.
Such previously identified biologically active compounds can be used in
embodiments of the above methods of inhibiting and treating. In preferred
embodiments, the targets, bacteriophage, and active compound are as described herein
for methods of inhibiting and methods of treating. Preferably the agent or compound
is formulated in a pharmaceutical composition which includes a pharmaceutically
acceptable carrier, excipient, or diluent. In addition, the invention provides agents,
compounds, and pharmaceutical compositions where an active compound is active on
an uncharacterized phage-specific site.

In preferred embodiments, the target is as described for embodiments of 
aspects above. 

Likewise, the invention provides a method of making an antibacterial agent.
The method involves identifying a target of a bacteriophage inhibitor polypeptide or
protein or RNA, screening a plurality of compounds to identify a compound active on
the target, and synthesizing the compound in an amount sufficient to provide a
therapeutic effect when administered to an organism infected by a bacterium naturally
producing the target. In preferred embodiments, the identification of the target and
identification of active compounds include steps or methods and/or components as
described above (or otherwise herein) for such identification. Likewise, the active
compound can be as described above, including fragments and derivatives of phage
inhibitor proteins, peptidomimetics, and small molecules. As recognized by those
skilled in the art, peptides can be synthesized by expression systems and purified, or
can be synthesized artificially. In preferred embodiments the inhibitory phage ORF
product is from S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus
pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038,
or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014.

As indicated above, sequence analysis of nucleotide and/or amino acid
sequences can beneficially utilize computer analysis. Thus, in additional aspects the
invention provides computer-related hardware and media and methods utilizing and
incorporating sequence data from uncharacterized phage, e.g., uncharacterized phage
listed in Table 1, preferably at least one of S. aureus phage 44AHJD ORF 1, 9, or 12,
Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021,
029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014, or of
S. aureus phage 44AHJD, Enterococcus sp. phage 182, or Streptococcus pneumoniae
phage Dp-1. In general, such aspects can facilitate the above-described aspects.
Various embodiments involve the analysis of genetic sequence and encoded products,
as applied to evaluating bacteriophage inhibitor ORFs and compounds and fragments
related thereto. The various sequence analyses, as well as function analyses, can be
used separately or in combination, as well as in preceding aspects and embodiments.
Use in combination is often advantageous as the additional information allows more
efficient prioritizing of phage ORFs for identification of those ORFs that provide
bacteria-inhibiting function.

In one aspect, the invention provides a computer-readable device which
includes at least one recorded amino acid or nucleotide sequence corresponding to one
of the specified phage and a sequence analysis program for analyzing a nucleotide
and/or amino acid sequence. The device is arranged such that the sequence
information can be retrieved and analyzed using the analysis program. The analysis
can identify, for example, homologous sequences or the indicated %s of the phage
genome and structural motifs. Preferably the sequence includes at least 1 phage ORF
or encoded product, more preferably at least 10%, 20%, 30%, 40%, 50%, 70%, 90%,
or 100% of the genomic phage ORFs and/or equivalent cDNA, RNA, or amino acid
sequences. Preferably the sequence or sequences in the device are recorded in a
medium such as a floppy disk, a computer hard drive, an optical disk, computer
random access memory (RAM), or magnetic tape. The program may also be recorded
in such medium. The sequences can also include sequences from a plurality of
different phage.

In this context, the term "corresponding" indicates that the sequence is at least
95% identical, preferably at least 97% identical, and more preferably at least 99%
identical to a sequence from the specified phage genome, a ribonucleotide equivalent,
a degenerate equivalent (utilizing one or more degenerate codons), or a homologous
sequence, where the homolog provides an equivalent biological function.

Similarly, the invention provides a computer analysis system for identifying
biologically important portions of a bacteriophage genome. The system includes a
data storage medium, e.g., as identified above, which has recorded thereon a
nucleotide sequence corresponding to at least a portion of at least one uncharacterized
bacteriophage genome, a set of program instructions to allow searching of the
sequence or sequences to analyze the sequence, and an output device, where the
portion includes at least the sequence length as specified in the preceding aspect. The
output device is preferably a printer, a video display, or a recording medium. More
than one output device may be included. For each of the present computer-related
aspects, the bacteriophage are preferably selected from the uncharacterized phage
listed in Table 1, more preferably from bacteriophage 77, 3A, 96, 44AHJD (S.
aureus), Dp-1 (Streptococcus pneumoniae), or 182 (Enterococcus).

In keeping with the computer device aspects, the invention also provides a
method for identifying or characterizing a bacteriophage ORF by providing a
computer-based system for analyzing nucleotide or amino acid sequences, e.g., as
described above. The system includes a data storage medium which has recorded a
sequence or sequences as described for the above devices, a set of instructions as in
the preceding aspect, and an output device as in the preceding aspect. The method
further involves analyzing at least one sequence, and outputting the analysis results to
at least one output device.
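
One analysis such a system might perform is a scan of a recorded phage genome for
candidate ORFs. The following is a minimal sketch only, not the analysis program of
the invention. It assumes that an ORF runs from an ATG to the next in-frame stop
codon, and it applies the cutoff of more than 33 amino acids used for the ORF tables
described herein. The name find_orfs and the coordinate convention are illustrative
assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(genome: str, min_aa: int = 34):
    """List ORFs (first ATG to next in-frame stop) of at least min_aa residues.

    Returns (strand, start, end, length_aa) tuples. Coordinates are 0-based,
    end-exclusive, and on the "-" strand are relative to the reverse complement.
    """
    genome = genome.upper()
    orfs = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # open a candidate ORF at the first ATG
                elif start is not None and codon in STOP_CODONS:
                    length_aa = (i - start) // 3  # residues, stop excluded
                    if length_aa >= min_aa:
                        orfs.append((strand, start, i + 3, length_aa))
                    start = None  # look for the next ATG in this frame
    return orfs
```

The default of 34 residues corresponds to "greater than 33 amino acids"; alternative
start codons and nested ORFs are deliberately ignored in this sketch.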

In preferred embodiments, the analysis identifies a sequence similarity or
homology with a sequence or sequences selected from bacterial ORFs encoding
products with related biological function; ORFs encoding known inhibitors; and
essential bacterial ORFs. Preferably the analysis identifies a probable biological
function based on identification of structural elements or characteristic or signature
motifs of an encoded product or on sequence similarity or homology. Preferably the
uncharacterized bacteriophage is from Table 1, more preferably at least one of
bacteriophage 77, 3A, 96, 44AHJD (S. aureus), Dp-1 (Streptococcus pneumoniae), or
182 (Enterococcus). In preferred embodiments, the method also involves determining
at least a portion of the nucleotide sequence of at least one uncharacterized
bacteriophage as indicated, and recording that sequence on the data storage medium of
the computer-based system. In preferred embodiments, the analysis identifies a
sequence similarity or homology with an S. aureus phage 44AHJD ORF 1, 9, or 12,
Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021,
029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014.

As used in the claims to describe the various inventive aspects and
embodiments, "comprising" means including, but not limited to, whatever follows the
word "comprising". Thus, use of the term "comprising" indicates that the listed
elements are required or mandatory, but that other elements are optional and may or
may not be present. By "consisting of" is meant including, and limited to, whatever
follows the phrase "consisting of". Thus, the phrase "consisting of" indicates that the
listed elements are required or mandatory, and that no other elements may be present.
By "consisting essentially of" is meant including any elements listed after the phrase,
and limited to other elements that do not interfere with or contribute to the activity or
action specified in the disclosure for the listed elements. Thus, the phrase "consisting
essentially of" indicates that the listed elements are required or mandatory, but that
other elements are optional and may or may not be present depending upon whether or
not they affect the activity or action of the listed elements.

Further embodiments will be apparent from the following Detailed Description
and from the claims.



BRIEF DESCRIPTION OF THE DRAWINGS 

FIGURES 1A and 1B are flow schematics showing the manipulations used to
convert pT0021, an arsenite inducible vector containing the luciferase gene, into
pTHA or pTM, two ars inducible vectors. Vector pTHA contains Bam HI, Sal I, and
Hind III cloning sites and a downstream HA epitope tag. Vector pTM contains Bam
HI and Hind III cloning sites and no HA epitope tag.

FIGURE 2 is a schematic representation of the cloning steps involved to place
the DNA segments of any of ORFs 17/19/43/102/104/182 or other sequences into
pTHA to assess inhibitory potential. For subcloning into pTM or pT0021, individual
ORFs were amplified by PCR using oligonucleotides targeting the ATG and stop
codons of the ORFs. Using this strategy, Bam HI and Hind III sites were positioned
immediately upstream and downstream, respectively, of the start and stop codons of
each ORF. Following digestion with Bam HI and Hind III, the PCR fragments were
subcloned into the same sites of pT0021 or pTM. Clones were verified by PCR and
direct sequencing.

FIGURE 3 shows a schematic representation of the functional assays used to
characterize the bactericidal and bacteriostatic potential of all predicted ORFs (>33
amino acids) encoded by bacteriophage 77. Fig. 3A) Functional assay on semi-solid
support media. Fig. 3B) Functional assay in liquid culture.



FIGURES 4A, 4B, and 4C are bar graphs showing the results of a screen in liquid
media to assess bacteriostatic or bactericidal activity of 93 predicted ORFs (>33
amino acids) encoded by bacteriophage 77. Growth inhibition assays were performed
as detailed in the Detailed Description. The relative growth of Staphylococcus aureus
transformants harboring a given bacteriophage 77 ORF (identified on the bottom of
the graph), in the absence or presence of arsenite, is plotted relative to growth of a
Staphylococcus aureus transformant containing ORF 5, a non-toxic bacteriophage 77
ORF (which is set at 100%). Each bar represents the average obtained from three
S. aureus transformants grown in duplicate. The bacteriophage 77 ORFs showing
significant growth inhibition are ORFs 17, 19, 102, 104, and 182.
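
The normalization described for this screen can be sketched as follows, purely for
illustration. The sketch assumes six growth readings per ORF (three transformants in
duplicate), averages them, and expresses the result as a percentage of the ORF 5
control mean. The function name, the use of optical density readings, and any input
values are hypothetical and are not data from the screen.

```python
from statistics import mean


def relative_growth(readings_by_orf: dict, control: str = "ORF 5") -> dict:
    """Express mean growth for each ORF as a percentage of the control mean.

    readings_by_orf maps an ORF label to its growth readings (for example,
    six optical density values: three transformants, each in duplicate).
    The control ORF is reported as 100.
    """
    control_mean = mean(readings_by_orf[control])
    return {
        orf: 100.0 * mean(values) / control_mean
        for orf, values in readings_by_orf.items()
    }
```

A transformant whose averaged reading is one quarter of the control's would be
plotted at 25% on this scale; no particular cutoff for "significant" inhibition is
assumed here.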



FIGURE 5 shows a block diagram of major components of a general purpose 
computer. 

FIGURE 6 shows an ORF map for Streptococcus pneumoniae bacteriophage 
Dp-1 showing the ORF identifiers, genomic locations, and orientations of the 85 
identified ORFs that were found to have ribosomal binding sites and thus are expected 
to be expressed. 
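
The manner in which ribosomal binding sites were identified is not specified at this
point in the description. As one hedged illustration, a candidate ORF could be flagged
when a Shine-Dalgarno-like motif lies a short distance upstream of its start codon. The
motif set, the 20-nucleotide window, the minimum spacing, and the name has_rbs are all
assumptions made for this sketch.

```python
# Shine-Dalgarno-like cores (an assumed motif set, not taken from the source).
SD_CORES = ("AGGAGG", "AGGAG", "GGAGG", "AGGA", "GGAG", "GAGG")


def has_rbs(genome: str, start: int, window: int = 20, min_gap: int = 4) -> bool:
    """Return True if an SD-like core lies upstream of the start codon.

    `start` is the 0-based index of the first base of the start codon. The core
    must fall within `window` nucleotides upstream of `start` and must end at
    least `min_gap` nucleotides before it.
    """
    upstream = genome[max(0, start - window):start].upper()
    searchable = upstream[:max(0, len(upstream) - min_gap)]
    return any(core in searchable for core in SD_CORES)
```

Applied to each candidate ORF start, a filter of this kind would reduce a full ORF
list to the subset expected to be expressed.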

FIGURE 7 shows a schematic representation of the arsenite-inducible
expression system present in a shuttle vector designed to express individual
Streptococcus bacteriophage Dp-1 ORFs in Streptococcus. Various modifications can
be readily made to such a vector, or other vectors can be readily constructed to
provide inducible expression of ORFs in a particular host bacterium using well-known
techniques.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The invention may be more clearly understood from the following description.
The tables will first be briefly described.

Table 1 is a listing of a large number of available bacteriophage that can be 
readily obtained and used in the present invention. 

Table 2 shows the complete nucleotide sequence of the genome of
Staphylococcus aureus bacteriophage 77.

Table 3 shows a list of all the ORFs from Bacteriophage 77 that were screened
in the functional assay to identify those with anti-microbial activity.

Table 4 shows the predicted nucleotide sequence, predicted amino acid
sequence, and physicochemical parameters of ORFs 17/19/43/102/104/182. These
include the primary amino acid sequence of the predicted protein, the average
molecular weight, amino acid composition, theoretical pI, hydrophobicity map, and
predicted secondary structure map.

Table 5 shows homology search results. BLAST analysis was performed with
ORFs 17/19/43/102/104/182 against the NCBI non-redundant nucleotide and
Swissprot databases. The results of this search indicate that: I) ORF 17 has no
significant homology to any gene in the NCBI non-redundant nucleotide
database, II) ORF 19 has significant homology to one gene in the NCBI
non-redundant nucleotide database - the gene encoding ORF 59 of bacteriophage phi PVL,
III) ORF 43 has significant homology to one gene in the NCBI non-redundant
nucleotide database - the gene encoding ORF 39 of phi PVL, IV) ORF 102 has
significant homology to one gene in the NCBI non-redundant nucleotide database -
the gene encoding ORF 38 of phi PVL, V) ORF 104 has no significant homology to
any gene in the NCBI non-redundant nucleotide database, VI) ORF 182 has
significant homology to one gene in the NCBI non-redundant nucleotide database -
the gene encoding ORF 39 of phi PVL.
Table 6 is a table from Alberts et al., MOLECULAR BIOLOGY OF THE
CELL 3rd ed., showing the redundancy of the "universal" genetic code.

Table 7 shows the complete nucleotide sequence of Staphylococcus aureus
bacteriophage 3A.

Table 8 is a listing of the ORFs identified in Staphylococcus aureus
bacteriophage 3A.

Table 9 shows the complete nucleotide sequence of Staphylococcus aureus
bacteriophage 96.

Table 10 is a listing of the ORFs identified in Staphylococcus aureus
bacteriophage 96.

Table 11 is a listing of sequences deposited in the NCBI public database
(GenBank) for bacteriophage listed in Table 1.

Table 12 is a listing of phage which encode a known lysis function, including
the identified lysis gene.

Table 13 is a listing of bacteriophage which encode holin genes, where holin 
genes encode proteins which form pores and eventually enable other enzymes to kill 
the host bacterium. 

Table 14 is a listing of bacteriophage which encode kil genes. 

Table 15 is a list of Staphylococcus aureus sequences identified by accession
number which may include sequences from genes coding for target sequences for the
phage 77-encoded antimicrobial proteins or peptides. The sequences were obtained
by searching GenBank for listings.

Table 16 shows the nucleotide sequence of the genome of Staphylococcus
aureus phage 44AHJD.

Table 17 lists and shows the sequence position of the 73 ORFs predicted to be
encoded by Staphylococcus aureus bacteriophage 44AHJD that are greater than 33
amino acids.

Table 18 shows the ORF sequences and putative amino acid sequences for the
Staphylococcus aureus bacteriophage 44AHJD ORFs greater than 33 amino acids.

Table 19 shows the similarities in sequence identified between predicted 
Staphylococcus aureus bacteriophage 44 AHJD ORFs and sequences present in public 
databases. 

Table 20 shows the homology alignments between predicted Staphylococcus
aureus bacteriophage 44AHJD ORFs and the corresponding protein sequences present
in public sequence databases.

Table 21 shows the complete nucleotide sequence of the genome of 
Enterococcus bacteriophage 182. 

Table 22 lists and shows the sequence position of the 80 ORFs identified in
bacteriophage 182 that are greater than 33 amino acids.

Table 23 shows the nucleotide and predicted amino acid sequence of all 80
ORFs identified in bacteriophage 182.

Table 24 shows the similarities identified to date in sequence between
Enterococcus phage 182 ORFs greater than 33 amino acids and sequences present in
public sequence databases.

Table 25 shows the predicted amino acid sequence as well as the predicted 
secondary structures map for two Enterococcus bacteriophage 182 ORFs. 

Table 26 shows the homology alignments between predicted Enterococcus
bacteriophage 182 ORFs and the corresponding protein sequences present in public
sequence databases.

Table 27 lists Enterococcus sequences listed in GenBank providing possible
Enterococcal target sequences for inhibitory Enterococcus bacteriophage 182 ORFs
and other compounds with antibacterial activity.

Table 28 shows the complete nucleotide sequence of the genome of
Streptococcus bacteriophage Dp-1.

Table 29 lists and shows the sequence position of the 273 ORFs identified in
Pneumococcal bacteriophage Dp-1 that are greater than 33 amino acids, 85 of which
are predicted to be expressed in Dp-1 as having a ribosomal binding site. That set of
85 ORFs is shown in the attached drawings.

Table 30 shows the nucleotide and predicted amino acid sequence of all 273
ORFs identified in bacteriophage Dp-1 that are identified as being expressed.

Table 31 shows the similarities identified in sequence between Streptococcus 
phage Dp-1 ORFs greater than 33 amino acids and sequences present in public 
sequence databases. 

Table 32 shows the 4731 bp sequence of Dp-1 published by Sheehan et al.
(1997).

Table 33 lists Streptococcus pneumoniae sequences listed in GenBank 
providing possible target sequences for inhibitory Streptococcus pneumoniae 
bacteriophage Dp-1 ORFs and other compounds with antibacterial activity 

Background; 

As indicated above, the present invention is concerned, in part, with the use of
bacteriophage coding sequences and the encoded polypeptides or RNA transcripts to
identify bacterial targets for potential new antibacterial agents. Thus, the invention
concerns the selection of relevant bacteria. Particularly relevant bacteria are those
which are pathogens of complex organisms such as animals, e.g., mammals,
reptiles, and birds, and plants. Examples include Staphylococcus aureus, Enterococcus
species, and Streptococcus pneumoniae. However, the invention can be applied to
any bacterium (whether pathogenic or not) for which bacteriophage are available or
which are found to have cellular components closely homologous to components
targeted by phage of another bacterium.

Thus, the invention also concerns the bacteriophage which can infect a
selected bacterium. Identification of ORFs or products from the phage which inhibit
the host bacterium both provides an inhibitor compound and allows identification of
the bacterial target affected by the phage-encoded inhibitor. Such targets are thus
identified as potential targets for development of other antibacterial agents or
inhibitors and the use of those targets to inhibit those bacteria. As indicated above,
even if such a target is not initially identified in a particular bacterium, such a target
can still be identified if a homologous target is identified in another bacterium.
Usually, but not necessarily, such another bacterium would be a genetically closely
related bacterium. Indeed, in some cases, a phage-encoded inhibitor can also inhibit
such a homologous bacterial cellular component.

The demonstration that bacteriophage have adapted to inhibiting a host
bacterium by acting on a particular cellular component or target provides a strong
indication that that component is an appropriate target for developing and using
antibacterial agents, e.g., in therapeutic treatments. Thus, the present invention
provides additional guidance over mere identification of bacterial essential genes, as
the present invention also provides an indication of accessibility of the target to an
inhibitor, and an indication that the target is sufficiently stable over time (e.g., not
subject to high rates of mutation) as phage acting on that target were able to develop
and persist. Thus, the present invention identifies a subset of essential cellular
components which are particularly likely to be appropriate targets for development of
antibacterial agents.

The invention also, therefore, concerns the development or identification of
inhibitors of bacteria, in addition to the phage-encoded inhibitory proteins (or RNA
transcripts), which are active on the targets of bacteriophage-encoded inhibitors. As
described herein, such inhibitors can be of a variety of different types, but are
preferably small molecules.

The following description provides preferred methods for use in the various
aspects of the invention. However, as those skilled in the art will readily recognize,
other approaches can be used to obtain and process relevant information. Thus, the
invention is not limited to the specifically described methods. In addition, the
following description provides a set of steps in a particular order. That series of steps
describes the overall development involved in the present invention. However, it is
clear that individual steps or portions of steps may be usefully practiced separately,
and, further, that certain steps may be performed in a different order or even bypassed
if appropriate information is already available or is provided by other sources or
methods.

Selecting and Growing Phage, and Isolating DNA

Conceptually, the first step involves selecting bacterial hosts of interest. 
Preferably, but not necessarily, such hosts will be pathogens of clinical importance.
Alternatively, because bacteria all share certain fundamental metabolic and structural
features, these features can be studied in one strain, for example a
nonpathogenic one, and the findings extrapolated to pathogenic ones.
Nonpathogenic strains may also exhibit initial advantages in being not only less
dangerous, but also, for example, in having better growth and culturing characteristics
and/or better developed molecular biology techniques and reagents. Consequently,
advantageously the invention provides the ability to target virtually any bacteria, but
preferably pathogenic bacteria, with antimicrobial compounds designed and/or
developed using bacteriophage inhibitory proteins and peptides from phage with
non-pathogenic and/or pathogenic hosts.

We have selected Staphylococcus aureus, Streptococcus pneumoniae, various
Enterococci, and Pseudomonas aeruginosa as initial exemplary pathogens. These
bacteria are a major cause of morbidity and mortality in hospital-based infections, and
the appearance of antibiotic resistance in all of these organisms makes it increasingly
difficult to treat benign infections involving these organisms. Such infections can
include, for example, otitis media, sinusitis, and skin and airway infections (Neu,
H.C. (1992). Science 257, 1064-1073). However, the approach described below is
clearly applicable to any human bacterial pathogens including but not restricted to
Mycobacterium tuberculosis, Neisseria gonorrhoeae, Haemophilus influenzae,
Acinetobacter, Escherichia coli, Shigella dysenteriae, Streptococcus pyogenes,
Helicobacter pylori, and Mycoplasma species. This invention can also be applied to
the discovery of anti-bacterial compounds directed against pathogens of animals other 
than humans, for example, sheep, cattle, swine, dogs, cats, birds, and reptiles. 
Similarly, the invention is not limited to animals, but also applies to plants and plant 
pathogens. 

In general, the bacteria are grown according to standard methodologies
employed in the art, including solid, semi-solid or liquid culturing, which procedures
can be found in or extrapolated from standard sources such as Maloy, S.R., Stewart,
V.J., and Taylor, R.K. Genetic Analysis of Pathogenic Bacteria (1996) Cold Spring
Harbor Laboratory Press, or Maniatis, T. et al. (1989) Molecular Cloning: A
Laboratory Manual. Cold Spring Harbor University Press, Cold Spring, N.Y.; or
Ausubel, F.M. et al. (1994) Current Protocols in Molecular Biology. John Wiley &
Sons, Secaucus, N.J. Culture conditions are selected which are adapted to the
particular bacterium generally using culture conditions known in the art as 
appropriate, or adaptations of those conditions. 

Nucleic acids within these bacteria can be routinely extracted through common 
procedures such as described in the above-referenced manuals and as generally known
to those skilled in the art. Those nucleic acid stocks can then be used to practice the
other inventive aspects described below. 

Selection and Growth of Bacteriophage, and Isolation of DNA

The second step involves assembling a group of bacteriophages (phage
collection) for one or more of the targeted bacterial hosts. While the invention can be
utilized with a single bacteriophage for a pathogen or other bacterium, it is preferable
to utilize a plurality of phage for each bacterium, as comparisons between a plurality
of such phage provide useful additional information. Non-limiting examples of
phage and sources for some of the above-mentioned pathogenic bacteria are found in
Table 1. The criteria used to select such phages are that they are infectious for the
microbe targeted, and replicate in, lyse, or otherwise inhibit growth of the bacterium 
in a measurable fashion. These phages can be very different from one another 
(representing different families), as judged by criteria such as morphology (head, tail, 
plate, etc.), and similarity of genome nucleotide sequence (cross-hybridization). Since
such diverse bacteriophages are expected to block bacterial host metabolism and
ultimately inhibit the host by a variety of mechanisms, their combined study will lead to the
identification of different mechanisms by which the phages independently inhibit
bacterial targets. Examples include degradation of host DNA (Parson K.A., and
Snustad, D.P. (1975). J. Virol. 15, 221-444) and inhibition of host RNA transcription
(Severinova, E., Severinov, K. and Darst, S.A. (1998). J. Mol. Biol. 279, 9-18). This,
in turn, yields novel information on phage proteins that can inhibit the targeted
microbe. As explained below, this 1) forms the basis of novel drug discovery efforts
based on knowledge of the primary amino acid sequence of the phage inhibitor
protein (e.g., peptide fragments or peptidomimetics) and/or 2) leads to the
identification of bacterial biochemical pathways, the proteins of which are essential or
significant for survival of the targeted microbe, and which enzymatic steps or
chemical reactions can be targeted by classical drug discovery methods using
molecular inhibitors, for example, small molecule inhibitors. 

Bacteriophage are generally either of two types, lytic or filamentous, meaning 
they either outright destroy their host and seek out new hosts after replication, or else 
continuously propagate and extrude progeny phage from the same host without
destroying it. Regardless of the phage life cycle and type, preferred embodiments
incorporate phage which impede cell growth in a measurable fashion and preferably
stop cell growth. To this end, lytic phage are preferred, although certain nonlytic 
species may also suffice, e.g., if sufficiently bacteriostatic. 

Various procedures that are commonly understood by those of skill in the art 
can be routinely employed to grow, isolate, and purify phage. Such procedures are
exemplified by those found in common laboratory aids such as Maloy, S.R.,
Stewart, V.J., and Taylor, R.K. Genetic Analysis of Pathogenic Bacteria (1996) Cold
Spring Harbor Laboratory Press; Maniatis, T. et al. (1989) Molecular Cloning: A
Laboratory Manual. Cold Spring Harbor University Press, Cold Spring, N.Y.; and
Ausubel, F.M. et al. (eds.) (1994) Current Protocols in Molecular Biology. John
Wiley & Sons, Secaucus, N.J. The techniques generally involve the culturing of
infected bacterial cells that are lysed naturally and/or with chemical assistance, for
example, by the use of an organic solvent such as chloroform that destroys the host
cells, thereby liberating the phage within. Following this, the cellular debris is
centrifuged away from the supernatant containing the phage particles, and the phage
are then selectively precipitated out of the supernatant using various
methods, usually employing alcohols and/or other chemical compounds
such as polyethylene glycol (PEG). The resulting phage can be further purified using
various density gradient/centrifugation methodologies. The resulting phage are then 
chemically lysed, thereby releasing their nucleic acids that can be conveniently 
precipitated out of the supernatant to yield a viral nucleic acid supply of the phage of 
interest. 

Exemplary bacteriophage are indicated in Table 1, along with sources where 
those phage may be obtained. 

Exemplary bacteria include the reference bacteria for the identified 
bacteriophage, available from the same sources. 

Characterizing Bacteriophage Genomes for ORFs

The third step involves systematically characterizing the genetic information 
contained in the phage genome. Within this genetic information is the sequence of all 
RNAs and proteins encoded by the phage, including those that are essential or
instrumental in inhibiting their host. This characterization is preferably done in a
systematic fashion. For example, this can be done by first isolating high molecular
weight genomic DNA from the phage using standard bacterial lysis methods, followed
by phage purification using density gradient ultracentrifugation, and extraction of
nucleic acid from the purified phage preparation. The high molecular weight DNA is
then analyzed to determine its size and to evaluate a proper strategy for its sequencing.
The DNA is broken down into smaller size fragments by sonication or partial
digestion with frequently cutting restriction enzymes such as Sau3A to yield
predominantly 1 to 2 kilobase length DNA, which DNA can then be resolved by gel
electrophoresis followed by extraction from the gel.

The ends of the fragments are enzymatically treated to render them suitable for 
cloning and the pools of fragments are cloned in a bacterial plasmid to generate a 
library of the phage genome. Several hundred of these random DNA fragments 
contained in the plasmid vector are isolated as clones after introduction into an
appropriate bacterium, usually Escherichia coli. They are then individually expanded
in culture and the DNA from each individual clone is purified. The nucleotide
sequences of the inserts of these clones are determined by standard automated or
manual methods, using oligonucleotide primers located on either side of the cloning
site to direct polymerase mediated sequencing (e.g., the Sanger sequencing method or
a modification of that method). Other sequencing methods can also be used.

The sequence of individual clones is then deposited in a computer, and 
specific software programs (for example, Sequencher™, Gene Codes Corp.) are used 
to look for overlap between the various sequences, resulting in ordering of contig 
sequences and ultimately providing the complete sequence of the entire bacteriophage
genome (one such example is given in Table 2 for Staphylococcus aureus
bacteriophage 77; others are also provided herein). This complete nucleotide 
sequence is preferably determined with a redundancy of at least 3- to 5-fold (number 
of independent sequencing events covering the same region) in order to minimize 
sequencing errors. 
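
As a rough illustration of the redundancy target, the number of clones to sequence can be estimated from the genome size, the usable read length per clone, and the desired fold coverage. The sketch below is only an approximation: it assumes that fragments start at random positions (a Poisson-type model that is not part of the description above), and the genome size and read length used in the example are hypothetical values chosen for illustration.

```python
import math


def clones_needed(genome_bp: int, read_bp: int, fold: float) -> int:
    """Number of sequencing reads needed to reach the requested average fold coverage."""
    return math.ceil(fold * genome_bp / read_bp)


def expected_fraction_covered(fold: float) -> float:
    """Expected fraction of bases read at least once.

    Assumes reads start at random positions (Poisson approximation);
    this is an illustrative model, not a guarantee for a real library.
    """
    return 1.0 - math.exp(-fold)


if __name__ == "__main__":
    genome_bp = 40_000  # hypothetical phage genome size
    read_bp = 500       # hypothetical usable read length per clone
    for fold in (3, 5):
        n = clones_needed(genome_bp, read_bp, fold)
        covered = expected_fraction_covered(fold)
        print(f"{fold}-fold: about {n} reads, ~{covered:.1%} of bases covered at least once")
```

Under these assumed values the estimate falls in the range of several hundred clones, which is consistent in order of magnitude with the library size mentioned above.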

Preferably, the bacterial strain used as a phage host should not possess any
other innate plasmids, transposons, or other phage or incompatible sequences that
would complicate or otherwise make the various manipulations and analyses more 
difficult. 

Commercially available computer software programs are used to translate the 
nucleotide sequence of the phage to identify all protein sequences encoded by the
phage (hereafter called open reading frames or ORFs). (Customized software can
clearly also be used.) As phages are known to transcribe their genome into RNA from
both strands, in both directions, and sometimes in more than one frame for the same
sequence, this exercise is done for both strands and in all six possible reading frames.
As evolutionary constraints have forced the phage to conserve all of its vital protein
sequences in as small a genome as possible, it is straightforward to identify all the
proteins encoded by the phage by simple examination of the 6 translation frames of
the genome. Once these ORFs are identified, they are cataloged into a phage
proteome database (Table 3 lists ORFs identified from phage 77; ORF lists are also
provided for other exemplary phage). This analysis is preferably performed for each
phage under study. The process of ORF identification can be varied depending on the
desired results. For example, the minimum length for the putative encoded
polypeptide can be varied, and/or putative coding regions that have an associated
Shine-Dalgarno sequence can be selected. In the case of phage 77 ORFs, such
parameter adjustment was performed and resulted in the identification of ORFs as
listed herein. Different parameters had resulted in the identification of the ORFs
listed in the preceding U.S. Provisional Application 60/110,992, filed December 3,
1998, which is hereby incorporated by reference in its entirety. 
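
The six-frame examination described above can be sketched in a few lines. This is a minimal illustration, not the software actually used for phage 77: the function names, the default minimum length, the set of accepted start codons, and the simple Shine-Dalgarno test (a fixed motif searched in a fixed upstream window) are all assumptions introduced here. The sketch uses the standard genetic code and reports coordinates relative to the strand being scanned.

```python
from typing import List, NamedTuple, Sequence

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODONS = {a + b + c: AMINO[16 * i + 4 * j + k]
          for i, a in enumerate(BASES)
          for j, b in enumerate(BASES)
          for k, c in enumerate(BASES)}


class Orf(NamedTuple):
    strand: str   # "+" for the given sequence, "-" for its reverse complement
    frame: int    # 0, 1 or 2 on the scanned strand
    start: int    # 1-based first base of the start codon (scanned-strand coordinates)
    end: int      # 1-based last base of the stop codon (scanned-strand coordinates)
    protein: str  # translated product, without the stop


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def has_shine_dalgarno(seq: str, start: int, motif: str = "GGAGG", window: int = 20) -> bool:
    """Crude check: is the (assumed) motif present within `window` bases upstream of the start?"""
    return motif in seq[max(0, start - window):start]


def find_orfs(genome: str, min_aa: int = 33, starts: Sequence[str] = ("ATG",),
              require_sd: bool = False) -> List[Orf]:
    """Scan both strands in all three frames for start-to-stop open reading frames."""
    genome = genome.upper()
    orfs: List[Orf] = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            begin = None  # index of the first start codon seen since the last stop
            for pos in range(frame, len(seq) - 2, 3):
                codon = seq[pos:pos + 3]
                if begin is None:
                    if codon in starts:
                        begin = pos
                elif CODONS.get(codon, "X") == "*":
                    # The initiator codon is written as M whatever its sequence.
                    body = "".join(CODONS.get(seq[i:i + 3], "X")
                                   for i in range(begin + 3, pos, 3))
                    protein = "M" + body
                    if len(protein) >= min_aa and (not require_sd or has_shine_dalgarno(seq, begin)):
                        orfs.append(Orf(strand, frame, begin + 1, pos + 3, protein))
                    begin = None
    return orfs


if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCT" * 4 + "TAA" + "GG"  # toy sequence, not phage data
    for orf in find_orfs(toy, min_aa=3):
        print(orf)
```

Varying `min_aa`, `starts` (for example adding "TTG" or "GTG"), or `require_sd` changes which candidate ORFs are reported, which mirrors the parameter adjustment described above.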

Exemplary phage 77 ORFs identified in that provisional application and as 
identified herein are shown in the following table: 



ORF ID from   Genomic        a.a.   Start    ORF ID from   Genomic        a.a.   Start
60/110,992    position       size   codon    241/190       position       size   codon

77ORF016      2369-24024     251    TTG      77ORF017      23269-23982    237    ATG
77ORF019      39845-40501    218    ATA      77ORF019      39851-40501    216    ATG
77ORF050      29268-29564    98     ATG      77ORF182      29268-29564    98     ATG
77ORF050      29268-29564    98     ATG      77ORF043      29304-29564    86     ATG
77ORF067      34312-34551    79     CTG      77ORF104      34393-34551    52     ATG
77ORF146      29051-29212    53     ATG      77ORF102      29051-29212    53     ATG

Identifying and Characterizing Inhibitory Phage ORFs 

The fourth step entails identifying the phage protein or proteins or RNA 
transcripts that have the ability to inhibit their bacterial hosts. This can be
accomplished, for example, by either or both of two non-mutually exclusive methods.
The first method makes use of bioinformatics. Over the past few years, a large amount
of nucleotide sequence information and corresponding translated products have
become available through large genome sequencing projects for a variety of
organisms including mammals, insects, plants, unicellular eukaryotes (yeast and
fungi), as well as several bacterial genomes such as E. coli, Mycobacterium
tuberculosis, Bacillus subtilis, Staphylococcus aureus and many others. Such
sequences have been deposited in public databases (for example, the non-redundant
sequence database at GenBank and the SwissProt protein sequence database;
http://www.ncbi.nlm.nih.gov) and can be freely accessed to compare any specific
query sequence to those present in such databases. For example, GenBank contains
over 1.6 billion nucleotides corresponding to 2.3 million sequence records. Several
computer programs and servers (e.g., TBLASTN) have been created to allow the rapid
identification of homology between any given sequence from one organism to that of 
another present in such databases, and such programs are public and available free of 
charge. 

In addition, it has been well established that basic biochemical pathways can
be conserved in very distant organisms (for example bacteria and man), and that the
proteins performing the various enzymatic steps in these pathways are themselves
conserved at the amino acid sequence level. Thus, proteins performing similar
functions (e.g., DNA repair, RNA transcription, RNA translation) have frequently
preserved key structural signatures, identifiable by similarities across regions of
proteins (domains and motifs). The antimicrobials of the present invention will
preferably target features and targets that are highly characteristic or conserved in 
microbes, and not higher organisms. 

Most genomes encode individual proteins or groups of proteins that can be 
assembled into protein families that have been evolutionarily conserved. Therefore,
similarity between a new query sequence and that of a member of a protein family
(reference sequences from public databases) can immediately suggest a biochemical 
function for the novel query sequence, which in our case is a phage ORF. 

The sequence homology between evolutionarily distant
members of a protein family is usually not randomly distributed along the entire
length of the sequence but is often clustered into "motifs" and "domains". These
correspond to key three-dimensional folds that form key catalytic and/or regulatory
structures that perform key biochemical function(s) for the group of proteins.
Commercially available computer software programs can identify such motifs in a
new query sequence, again providing functional information for the query sequence.
Such structural and functional motifs have also been derived from the combined
analysis of primary sequence databases (protein sequences) and protein structure
databases (X-ray crystallography, nuclear magnetic resonance) using so-called
"threading" methods (Rost, B. and Sander, C. (1996) Ann. Rev. Biophys. Biomol.
Struct. 25, 113-136).

Such motifs and folds are themselves deposited in public databases which can
be directly accessed (for example, SwissProt database; 3D-ALI at EMBL, Heidelberg;
PROSITE). This basic exercise leads to a structural homology map in which each of
the phage ORFs has been probed for such similarities, and where initial structural and
functional hits are identified (selected examples of sequence homologies detected
between individual ORFs from the genome of Staphylococcus aureus bacteriophage
77 and sequences deposited in public databases are shown in Table 5 for ORFs
17/19/43/102/104/182).

This analysis can point out phage proteins with similarity to proteins from 
other phages (such as those for E. coli) playing an important role in the basic 
biochemical pathways of the phage (such as DNA replication, RNA transcription, 
tRNAs, coat protein and assembly). Selected examples of such proteins include
integrase and capsid protein. Therefore, this analysis enables identification and
elimination of non-essential ORFs as candidates for an inhibitor function, as well as 
the identification of (potentially) useful ones. 

In addition, this analysis can point out specific ORFs as possible inhibitor 
ORFs. For example, these ORFs may encode proteins or enzymes that alter bacterial
cell structure, metabolism or physiology, and ultimately viability. Examples of such
proteins present in the genome of Staphylococcus aureus bacteriophage 77 include
orf14 (deoxyuridine triphosphatase from bacteriophage T5), and orf15 (sialidase).
(These ORF identifications are as listed in provisional application 60/110,992.) Other
examples include ORFs 9 and 12 of S. aureus phage 44 AHJD, which encode the
putative lysis functions found in many bacteriophages - a "holin" and an "amidase".

In addition, it is well known that bacterial and eukaryotic viruses can usurp 
pathways from their host in order to use them to their advantage in blocking host 
cellular pathways upon infection. The phage can achieve this by 1) directly producing 
an inhibitor of a key host pathway (e.g., T7 gene 0.5 and 2), 2) directly producing a
novel activity (e.g., T4 DNA polymerase), and 3) altering concentrations of cell
components by producing similar functions (e.g., T4 transfer RNAs). The
identification of sequence similarity between phage ORFs and bacterial host genome
sequences will be highly indicative of such a mechanism. (Selected examples of such
homologies are listed in Figure 4 of the provisional application 60/110,992 and
include orf4 (homologous to autolysin), orf20 (hypothetical protein from
Staphylococcus aureus) and orf9 (hypothetical protein from Staphylococcus aureus).)
These ORFs can be analyzed by a standard biochemical approach to directly test their 
inhibitor functions (e.g., as described below). 

Alternatively, a homology search may reveal that a given phage ORF is related
to a protein present in the databases having an activity known to be inhibitory (e.g.,
inhibitor of host RNA polymerase by E. coli bacteriophage T7). Such a finding would
implicate the phage ORF product in a related activity. This will also suggest that a
new antimicrobial could be derived by a mimetic approach (e.g., peptidomimetic)
imitating this function or by a small molecule inhibitor to the bacterial target of the
phage ORF, or any steps in the relevant host metabolic pathway, e.g., high throughput
screening of small molecule libraries. Selected examples of such similarity between
ORFs of Staphylococcus aureus bacteriophage 77 and proteins with inhibitor functions
for bacterial hosts are listed in Figure 4 of the provisional application 60/110,992.
These include orf9 (similar to bacteriophage P1 kilA function), and orf4 (autolysin of
Staphylococcus aureus, amidase enzymatic activity).

A reason for the biochemical study of individual ORFs for inhibitor function is
that their expression or overexpression will block cellular pathways of the host,
ultimately leading to arrest and/or inhibition of host metabolism. In addition, such
ORFs can alter host metabolism in different ways, including modification of
pathogenicity. Therefore, individual ORFs identified above are expressed, preferably
overexpressed, in the host and the effect of this expression or overexpression on host
metabolism and viability is measured. This approach can be systematically applied to
every ORF of the phage, if necessary, and does not rely on the absolute identification
of candidate ORFs by bioinformatics. Individual ORFs are resynthesized from the
phage genomic DNA, e.g., by the polymerase chain reaction (PCR), preferably using
oligonucleotide primers flanking the ORF on either side. These single ORFs are
preferably engineered so that they contain appropriate cloning sites at their extremities
to allow their introduction into a new bacterial expression plasmid, allowing
propagation in a standard bacterial host such as E. coli, but containing the necessary
information for plasmid replication in the target microbe such as S. aureus (hereafter 
referred to as shuttle vector). Shuttle vectors and their use are well known in the art. 

Such shuttle vectors preferably also contain regulatory sequences that allow
inducible expression of the introduced ORF. As the candidate ORF may encode an
inhibitor function that will eliminate the host, it is beneficial that it not be expressed
prior to testing for activity. Thus, screening for such sequences when expressed in a
constitutive fashion is less likely to be successful when the inhibitor is lethal. In the
exemplary inducible system presented in Figures 1A, 1B, 2, and 7, regulatory
sequences from the ars operon of S. aureus are used to direct individual ORF
expression in S. aureus (or other bacteria in which the ars system is functional). The
ars operon encodes a series of proteins which normally mediate the extrusion of
arsenite and other trivalent oxyanions from the cells when they are exposed to such
toxic substances in their environment. The operon encoding this detoxifying
mechanism is normally silent and only induced when arsenite-related compounds are
present (Tauriainen, S. et al. (1997) App. Env. Microb., Vol. 63, No. 11, p. 4456-4461).

Therefore, individual phage ORFs can be expressed in S. aureus in an 
inducible fashion by adding to the culture medium non-toxic arsenite concentrations 
during the growth of individual S. aureus clones expressing such individual phage
ORFs. Toxicity of the phage inhibitor ORF for the host is monitored by reduction or
arrest of growth under induction conditions, as measured by optical density in liquid
culture or after plating the induced cultures on solid medium. Subsequently,
interference of the phage ORF with the host biochemical pathways ultimately leading
to reduced or arrested host metabolism can be measured by pulse-chase experiments
using radiolabeled precursors of either DNA replication, RNA transcription, or protein 
synthesis. Similar constructs can be made and used for other bacteria using well- 
known techniques. 
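
The growth comparison under induction conditions can be illustrated with a short sketch. It assumes that optical density readings are taken at matching time points for an induced and an uninduced culture of the same clone; the scoring rule (ratio of net optical density gain) and the 0.5 cutoff are hypothetical choices made here for illustration, not values taken from this description, and the readings in the example are invented.

```python
from typing import Dict, List, Sequence, Tuple


def relative_growth(induced_od: Sequence[float], uninduced_od: Sequence[float]) -> float:
    """Net OD gain of the induced culture as a fraction of the uninduced control's gain."""
    induced_gain = induced_od[-1] - induced_od[0]
    control_gain = uninduced_od[-1] - uninduced_od[0]
    if control_gain <= 0:
        raise ValueError("uninduced control did not grow; comparison is not meaningful")
    return max(induced_gain, 0.0) / control_gain


def flag_inhibitory(readings: Dict[str, Tuple[Sequence[float], Sequence[float]]],
                    threshold: float = 0.5) -> List[str]:
    """Return the clone names whose induced growth falls below `threshold` of the control.

    `threshold` is a hypothetical cutoff; a real screen would calibrate it
    against vector-only controls and replicate measurements.
    """
    return [name for name, (induced, uninduced) in readings.items()
            if relative_growth(induced, uninduced) < threshold]


if __name__ == "__main__":
    # Invented optical density readings, for illustration only.
    example = {
        "clone_A": ([0.05, 0.06, 0.07], [0.05, 0.40, 0.85]),  # growth arrested on induction
        "clone_B": ([0.05, 0.38, 0.80], [0.05, 0.40, 0.85]),  # grows like the control
    }
    print(flag_inhibitory(example))
```

Clones flagged in this way would then be candidates for the follow-up pulse-chase measurements described above.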

Those skilled in the art are familiar with a variety of other inducible systems 
which can also be used for the controlled expression of phage ORFs, including, for
example, lactose (see, e.g., Stratagene's LacSwitch™II system; La Jolla, CA) and
tetracycline-based systems (see, e.g., Clontech's Tet On/Tet Off™ system; Palo Alto,
CA). The arsenite-inducible system described is further depicted in Figures 1, 2 and 7.
The selection or construction of shuttle vectors and the selection and use of
inducible systems are well known and thus other shuttle vectors appropriate for other
bacteria can be readily provided by those skilled in the art, e.g., for use in other 
bacterial species. 

Standard methodologies for expressing proteins from constructs, and isolating 
and manipulating those proteins, for example in cross-linking and affinity
chromatography studies, may be found in various commonly available and known
laboratory manuals. See, e.g., Current Protocols in Protein Science. John Wiley &
Sons, Secaucus, N.J., and Maniatis, T. et al. (1989) Molecular Cloning: A Laboratory
Manual. Cold Spring Harbor University Press, Cold Spring, N.Y.

It has been found that certain phage or other viruses inhibit host cells, at least
in part, by producing an antisense RNA which binds to and inhibits translation from a
bacterial RNA sequence. Thus, in the case of potentially inhibitory RNA transcripts
encoded by the phage genome, a strong indicator of a possible inhibitory function is
provided by the identification of phage sequence which is identical to or fully
complementary to a bacterial sequence (or with only a small percentage of mismatch,
e.g., <10%, preferably less than 5%, most preferably less than 3%). This approach is
convenient in the case of bacteria that have been essentially completely sequenced, as
the comparison can be performed by computer using public database information.

The inhibitory effect of the transcript can be confirmed using expression of the
phage sequence in a host bacterium. If needed, such inhibition can also be tested by
transfecting the cells with a vector that will transcribe the phage sequence to form
RNA in such manner that the RNA produced will not be translated into a polypeptide.
Inhibition under such conditions provides a strong indication that the inhibition is due
to the transcript rather than to an encoded polypeptide. 
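
The complementarity criterion for candidate antisense transcripts (full complementarity, or a mismatch fraction below a chosen limit such as 10%) can be sketched as a simple ungapped scan. This is only an illustration under stated assumptions: the function names are introduced here, sequences are treated as DNA strings, gaps are not considered, and a practical search over a whole bacterial genome would normally use an indexed alignment tool rather than this quadratic loop.

```python
from typing import List, Tuple


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def mismatch_fraction(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def antisense_hits(phage_segment: str, bacterial_seq: str,
                   max_mismatch: float = 0.10) -> List[Tuple[int, float]]:
    """Find windows of `bacterial_seq` complementary to `phage_segment`.

    Returns (0-based offset, mismatch fraction) for every ungapped window
    whose mismatch fraction against the reverse complement of the phage
    segment is strictly below `max_mismatch`.
    """
    probe = reverse_complement(phage_segment.upper())
    target = bacterial_seq.upper()
    hits = []
    for offset in range(len(target) - len(probe) + 1):
        frac = mismatch_fraction(probe, target[offset:offset + len(probe)])
        if frac < max_mismatch:
            hits.append((offset, frac))
    return hits


if __name__ == "__main__":
    # Toy sequences, not real phage or bacterial data.
    bacterial_region = "ATGGCTAGCTAGGATCCGTA"
    bacterial = "TTTT" + bacterial_region + "CCCC"
    phage = reverse_complement(bacterial_region)  # perfectly complementary segment
    print(antisense_hits(phage, bacterial))
```

To test for identity rather than complementarity, the same scan can be run with the reverse complement of the phage segment passed in as `phage_segment`. Tightening `max_mismatch` to 0.05 or 0.03 corresponds to the preferred limits mentioned above.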

In an alternative, the expression of an ORF in a host bacterium is found to be 
inhibitory, but the inhibition is found to be due to an RNA product of the genomic 
coding region. For antisense inhibition, the sequence of the bacterial target nucleic
acid sequence can be identified by inspection of the phage sequence, and the full
sequence of the relevant coding region for the bacterial product can be found from a
database of the bacterial genomic sequence or can be isolated by standard techniques
(e.g., a clone in a genomic library can be isolated which contains the full bacterial
ORF, and then sequenced).

In either case, the identification of a target which is inhibited by an RNA
transcript produced by a phage provides both the possible inhibition of bacteria
naturally containing the same target nucleic acid sequence, as well as the ability to use
the target sequence in screening for other types of compounds which will act directly
on the target nucleic acid sequence or on a polypeptide product expressed or
regulated, at least in part, by the target of the inhibitory phage RNA.

In some cases it will be found that the target of an inhibitory phage RNA or
protein has previously been found to be a target for an antibacterial agent. In such
cases, the phage inhibitor can still provide useful information if it is found that the
phage-encoded product acts at a different site than the previously identified
antibacterial agent or inhibitor, i.e., acts at a phage-specific site. For many targets,
action at a different site provides highly beneficial characteristics and/or information.
For example, an alternate site of inhibitor action can at least partially overcome a
resistance mechanism in a bacterium. As an illustration, in many cases, resistance is
due, in large part, to altered binding characteristics of the immediate target to the
antibacterial agent. The altered binding is due to a structural change which prevents
or destabilizes the binding. However, the structural change is frequently quite local,
so that compounds which bind at different local sites will be unaffected or affected to a
much lesser degree. Indeed, in some cases the local sites will be on a different
molecule and so may be completely unaffected by the local structural change creating
resistance to the original agent(s). An example of resistance due to altered binding is
provided by methicillin-resistant Staphylococcus aureus, in which the resistance is
due to an altered penicillin-binding protein.

In other cases, a new site of action can have improved accessibility as 
compared to a site acted on by a previously identified agent. This can, for example, 
assist in allowing effective treatment at lower doses, or in allowing access by a larger
range of types of compounds, potentially allowing identification of more potential 
active agents. 

Another advantage is that the structural characteristics of a different site of 
action will lead to identification and/or development of inhibitors with different
structures and different pharmacological parameters. This can allow a greater range of
possibilities when selecting an antibacterial agent. 

Yet further, different sites often produce different inhibitory characteristics in 
the target organism. This is commonly the case for multi-domain target proteins. 
Thus, inhibition targeting an alternate site can produce more efficacious action, e.g.,
faster killing, slower development of resistance, lower numbers of surviving cells, and
different secondary effects (for example, different nutrient utilization). 

Staphylococcus aureus phage 77 

As indicated above, the present invention is concerned, in part, with the use of
bacteriophage 77 coding sequences and the encoded polypeptides or RNA transcripts
to identify bacterial targets for potential new antibacterial agents. 

As described, phage 77 ORFs 17, 19, 43, 102, 104, and 182 have been found
to have bacteria inhibiting function. Identification of ORFs 17, 19, 43, 102, 104, and
182 and products from the phage which inhibit the host bacterium both provides an
inhibitor compound and allows identification of the bacterial target affected by the
phage-encoded inhibitor. Such a target is thus identified as a potential target for
development of other antibacterial agents or inhibitors and the use of those targets to
inhibit those bacteria. As indicated above, even if such a target is not initially
identified in a particular bacterium, such a target can still be identified if a
homologous target is identified in another bacterium. Usually, but not necessarily,
such another bacterium would be a genetically closely related bacterium. Indeed, in
some cases, an inhibitor encoded by phage 77 ORF 17, 19, 43, 102, 104, or 182 can
also inhibit such a homologous bacterial cellular component.

Possible bacterial target sequences are described herein by reference to sequence
source sites. In preferred embodiments, the sequence encoding the target corresponds
to a S. aureus nucleic acid sequence available from numerous sources including S.
aureus sequences deposited in GenBank, S. aureus sequences found in European
Patent Application No. 97100110.7 to Human Genome Sciences, Inc. filed January 7,
1997, S. aureus sequences available from TIGR at
http://www.tigr.org/tdb/mdb/mdb.html, and S. aureus sequences available from the
Oklahoma University S. aureus sequencing project at the following URL:
http://www.genome.ou.edu/staph_new.html. Such possible targets are particularly
applicable to S. aureus phages 77, 3A, 96, and 44 AHJD.

The amino acid sequence of a polypeptide target is readily provided by
translating the corresponding coding region. For the sake of brevity, the sequences
are not reproduced herein. Also, in preferred embodiments, a target sequence
corresponds to a S. aureus coding sequence corresponding to a sequence listed in
Table 15 herein. The listing in Table 15 describes S. aureus sequences currently listed
with GenBank. Again, for the sake of brevity, the sequences are described by
reference to the database accession numbers instead of being written out in full herein.
In cases where an entry for a coding region is not complete, the complete sequence
can be readily obtained by routine methods, e.g., by isolating a clone in a phage host
S. aureus genomic library, and sequencing the clone insert to provide the relevant
coding region. The boundaries of the coding region can be identified by conventional
sequence analysis and/or by expression in a bacterium in which the endogenous copy
of the coding region has been inactivated and using subcloning to identify the
functional start and stop codons for the coding region.
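
The translation of a coding region into an amino acid sequence, as referred to above, can
be illustrated with a minimal Python sketch. It assumes the standard genetic code (whose
codon assignments are shared by the bacterial translation table) and a coding region that
is already in frame; the function name translate and the example sequence are
illustrative assumptions, not part of the disclosure.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(coding_region: str) -> str:
    """Translate an in-frame DNA coding region, stopping at the first stop codon."""
    seq = coding_region.upper().replace("U", "T")
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        # Codons containing ambiguous bases (e.g. N) are reported as X.
        residue = CODON_TABLE.get(seq[pos:pos + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical 12-nucleotide example: Met-Ala-Lys followed by a stop codon.
print(translate("ATGGCTAAATAA"))
```

In practice the accession-numbered database entry would supply the coding region; only
the codon lookup itself is shown here.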

Staphylococcus aureus phage 44AHJD

The present invention also can utilize the identification of naturally occurring
DNA sequence elements within Staphylococcus aureus bacteriophage 44AHJD which
encode proteins with antimicrobial activity.

Such identification can utilize bioinformatics identification of specific proteins
(ORFs) utilized by Staphylococcus aureus bacteriophage 44AHJD during the viral life
cycle, resulting in a slowing or arrest of growth of the bacterial host, or in death of
the Staphylococcus aureus host, including lysis of the infected bacteria. Thus, some of
the bacteriophage 44AHJD DNA sequences encoding these proteins (ORFs) are
predicted to encode antimicrobial functions. Information derived from these DNA
sequences and translated ORFs can, in turn, be utilized to develop inhibitory
compounds by peptidomimetics that can also function as antimicrobials. In addition,
the host bacterial proteins that are targeted and inhibited by the
antimicrobial bacteriophage ORFs can themselves provide novel targets for drug
discovery.

The methodology described above is used to identify and characterize DNA
sequences from Staphylococcus sp. bacteriophage 44AHJD that have antimicrobial
activity. As described in the Examples, the Staphylococcus aureus propagating strain
(PS 44A), obtained from the Felix d'Herelle Reference Centre (#HER 1101), was
used as a host to propagate its phage 44AHJD, also obtained from the Felix d'Herelle
Reference Centre (#HER 101). By sequencing, we found that bacteriophage 44AHJD
consists of 16,668 bp (Table 16) predicted to encode 73 ORFs greater than 33 amino
acids (Tables 17 & 18). Computational analysis of the predicted protein products of
Staphylococcus aureus bacteriophage 44AHJD identified homologs in public sequence
databases as listed in Tables 19 and 20, along with the accompanying list of related
proteins.
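
The prediction of ORFs greater than 33 amino acids from a phage genome sequence can be
sketched as a six-frame scan. The following Python example is only an illustration under
stated assumptions: an ORF is taken to run from an ATG start codon to the first in-frame
stop codon, the length threshold counts codons excluding the stop, and the function
names are hypothetical. The ORF definition actually used to produce the Tables may
differ (for example, by allowing alternative start codons).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(genome: str, min_codons: int = 34, start_codons=("ATG",)):
    """Scan all six reading frames for start-to-stop ORFs.

    Returns tuples (strand, start, end, n_codons). Coordinates are 0-based,
    end-exclusive, and for the "-" strand refer to the reverse complement.
    min_codons=34 corresponds to "greater than 33 amino acids".
    """
    genome = genome.upper()
    orfs = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(seq) - 2, 3):
                codon = seq[pos:pos + 3]
                if start is None and codon in start_codons:
                    start = pos  # first start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (pos - start) // 3  # stop codon not counted
                    if n_codons >= min_codons:
                        orfs.append((strand, start, pos + 3, n_codons))
                    start = None
    return orfs


# Hypothetical toy sequence: one 41-codon ORF on the forward strand.
toy_genome = "ATG" + "GCT" * 40 + "TAA"
print(find_orfs(toy_genome))
```

A short threshold such as this deliberately over-predicts, which is why the predicted
ORFs are subsequently compared against public sequence databases.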

From this analysis, it is apparent that 3 genes (ORF 3, 7, and 8) are related to
structural proteins found in other bacteriophages. These include genes predicted to
encode a tail protein (ORF 3), an upper collar/connector protein of the phage virion
(ORF 7), and a lower collar protein (ORF 8). Bioinformatics has also identified one
gene whose product is likely involved in phage DNA synthesis. One gene (ORF 1)
shows significant homology to DNA polymerases of a number of bacteriophages,
bacteria and fungi, and the product of this gene is likely responsible for replicating
the genetic material of bacteriophage 44AHJD. ORF 2 encodes a protein with
homology to the product of the dinC gene of Bacillus subtilis, a protein involved in
teichoic acid biosynthesis. Teichoic acid is a polyphosphate polymer found in some,
but not all, Gram positive organisms (and not in Gram negative organisms), where it
is attached to the peptidoglycan layer. The phage protein may thus be involved in the
synthesis of this material for incorporation into the cell wall, allowing enhanced lysis
by the phage lysis enzymes or, as many enzymes can function in "reverse reactions",
may be involved in its degradation allowing for penetration of the peptidoglycan and
phage genome entry into the cell following adsorption. The similarity between
Staphylococcus aureus bacteriophage 44AHJD and E. coli phage T7 indicates that
they may share similar mechanisms of replication and growth. Both phages belong to
the Podoviridae Family of bacteriophages and are members of the "T7-like" Genus
of this Family (Ackermann and DuBow; VIth ICTV Report).

Two genes, ORF 9 and 12, were identified with the potential to encode
antimicrobial protein products. The homology alignments are shown in Tables 19 and
20. The predicted product of ORF 9 is related to a class of genes which encodes
lysozyme-like functions, enzymes which cleave linkages in the mucopolysaccharide
cell wall structure of a variety of micro-organisms, including that from the
Staphylococcus aureus bacteriophage Twort. ORF 12 of Staphylococcus aureus
bacteriophage 44AHJD shows homology to a set of lysis proteins from several
bacteriophages. These lysis proteins are also referred to as holins, and represent
phage-encoded lysis functions required for transit of the phage murein hydrolases
(lysozyme) to the periplasm, where they can digest the cell wall and thus lyse the
bacterium.

Thus, in particular embodiments, the present invention provides a nucleic acid
sequence isolated from Staphylococcus aureus bacteriophage 44AHJD comprising at
least a portion of one of the genes described above with antimicrobial activity. For
example, ORF 1 encodes a DNA polymerase function. This polymerase may utilize
host-derived accessory proteins for its activity when replicating the phage template,
sequestering such proteins from use by the bacterial polymerase, resulting in
inhibition of DNA replication, cell division, and cell growth. Alternatively, ORF 9
directly encodes a polypeptide with antimicrobial activity. ORF 9 is predicted to
encode an amidase, a protein known to act as a cell wall degrading enzyme. ORF 12
likely encodes a holin function required for transit of the phage amidase (gene 9
product) to the periplasm. When the gene for this type of product from Bacillus phage
phi 29 (gene 14) was cloned in Escherichia coli, cell death ensued (Steiner et al., 1993).
Thus, production of protein from Bacillus phage phi 29 gene 14 in E. coli resulted in
cell death, whereas production of protein from Bacillus phage phi 29 gene 14
concomitantly with the phi 29 lysozyme or unrelated murein-degrading enzymes led
to lysis, suggesting that membrane-bound protein 14 induces a nonspecific lesion in
the cytoplasmic membrane (Steiner et al., 1993).

The present invention also provides the use of the Staphylococcus
bacteriophage 44AHJD antimicrobial ORFs or ORF products as pharmacological
agents, either wholly or in part, and derivatives thereof, as well as the use of
corresponding peptidomimetics, developed from amino acid or nucleotide sequence
knowledge derived from Staphylococcus bacteriophage 44AHJD killer ORFs.

Enterococcus phage 182

Bacteriophage 182 was obtained from the Felix d'Herelle phage collection
(Ste. Foy, Quebec) and infects Enterococcus sp. Group D. The genome of
Enterococcus bacteriophage 182 consists of 17,833 bp (Table 21) and is predicted to
encode 80 ORFs greater than 33 amino acids (Tables 22 and 23). Computational
analysis of the predicted protein products of Enterococcus bacteriophage 182 was
performed in order to identify protein products related to those deposited in public
databases. Bacteriophage 182 protein products for which sequences with
significant sequence similarity were detected in public databases are listed in
Tables 24 and 26, along with the accompanying list of related proteins.

From this analysis, it is apparent that 5 genes (ORF 001, 004, 007, 009, and
011) are related to structural proteins of several Bacillus phages - Bacillus
bacteriophage PZA, phi-29, and B103. These include genes predicted to encode a tail
protein (ORF 001), a head protein (ORF 004), an upper collar protein (ORF 007), a
lower collar protein (ORF 009), and a pre-neck appendage protein (ORF 011). Two
genes are predicted to encode products which direct phage morphogenesis -
these are ORF 005 and 019.

Bioinformatics has also identified three genes whose products are likely
involved in phage DNA synthesis. One gene, ORF 002, shows significant homology to
DNA polymerases of a number of bacteriophages, and the product of this gene is
likely responsible for replicating the genetic material of bacteriophage 182. ORF 006
encodes a protein with homology to the encapsidation proteins of several other
bacteriophages, including Bacillus phage phi-29 (P11014), PZA (P07541), and B103
(X99260) and Streptococcus phage CP-1 (Z47794). These gene products catalyze the
in vivo and in vitro genome-encapsidation reaction (Garvey et al., 1985). Proteins
involved in genome packaging have been shown to have additional activities that
affect biochemical reactions in other phages and their hosts. For example, the coat
protein of the RNA bacteriophage MS2 interacts with viral RNA to translationally
repress replicase synthesis (Pickett and Peabody, 1993). This protein-RNA interaction
also plays a role in genome encapsidation, enveloping a single copy of the viral
genome in a protein shell composed of many molecules of coat protein. In addition,
the bacteriophage lambda terminase enzyme can be lethal to E. coli when expressed,
suggesting cleavage of packaging sites in the bacterial chromosome. Also present
within bacteriophage 182 is a gene, ORF 010, that encodes a protein that is related to
the terminal proteins of Bacillus phage Nf (P06812), Bacillus phage GA-1 (X96987)
and Bacillus phage B103 (X99260). DNA terminal proteins are linked to the 5' ends
of both strands of the genome and are essential for DNA replication, playing a role in
initial priming of DNA replication. The similarity between Enterococcus
bacteriophage 182 and Bacillus phages phi-29, PZA, and B103 indicates that they
may share similar mechanisms of replication and growth. Protein-primed DNA
replication is a well described phenomenon, and in the phi-29-like phages, the ends of
the DNA serve as origins and termini of replication (Gutierrez et al., 1986; Yoshikawa
et al., 1985).

There is also a gene (ORF 015) that encodes a protein showing homology to 
an early protein product of Bacillus bacteriophage PZA and the single-strand nucleic 
acid binding protein of bacteriophage B103. 

Two genes, ORF 008 and 014, were identified with the potential to encode
anti-microbial protein products. The homology alignments are shown in Tables 24 &
26 and biochemical features of the predicted polypeptides are shown in Table 25. The
predicted product of ORF 008 is related to a class of genes which encodes
lysozyme-like functions, enzymes which cleave linkages in the mucopolysaccharide cell
wall structure of a variety of micro-organisms. ORF 014 of Enterococcus 182 shows
homology to a set of lysis proteins from Bacillus bacteriophage phi-29, PZA, and
B103. These lysis proteins are also referred to as holins and represent phage encoded
lysis functions required for transit of the phage murein hydrolases (lysozyme) to the
periplasm, where they can digest the cell wall and thus lyse the bacterium.

Thus, the present invention provides a nucleic acid sequence obtained from
Enterococcus bacteriophage 182 comprising at least a portion of a phage 182 ORF,
preferably an inhibitory ORF, and more preferably at least a portion of one of the
genes described above with anti-microbial activity. For example, ORF 002 encodes a
DNA polymerase function. This polymerase may utilize host-derived accessory
proteins for its activity when replicating the phage template, sequestering such
proteins from use by the bacterial polymerase, resulting in inhibition of DNA
replication, cell division, and cell growth. Alternatively, ORFs 008 or 014 directly
encode polypeptides with anti-microbial activity. ORF 008 is predicted to encode an
autolytic lysozyme, a protein known to have anti-microbial activity (Martin et al.,
1998). ORF 014 likely encodes a holin function required for transit of the phage
murein hydrolases to the periplasm. When the related product from Bacillus phage phi
29 (gene 14) was cloned in Escherichia coli, cell death ensued (Steiner et al., 1993).
Thus, production of protein from Bacillus phage phi 29 gene 14 in E. coli resulted in
cell death, whereas production of protein from Bacillus phage phi 29 gene 14
concomitantly with the phi 29 lysozyme or unrelated murein-degrading enzymes led
to lysis, suggesting that membrane-bound protein 14 induces a nonspecific lesion in
the cytoplasmic membrane (Steiner et al., 1993).

The present invention also provides the use of the Enterococcus bacteriophage
182 anti-microbial ORFs as pharmacological agents, either wholly or in part, and
derivatives thereof, as well as the use of corresponding peptidomimetics, developed
from amino acid or nucleotide sequence knowledge derived from Enterococcus
bacteriophage 182 killer ORFs. This can be done where the structure of the
peptidomimetic compound corresponds to the structure of the active portion of a
product of an ORF. In this analysis, the peptide backbone is transformed into a
carbon-based hydrophobic structure that can retain cytostatic or cytocidal activity for
the bacterium. This is done by standard medicinal chemistry methods, measuring growth
inhibition of the various molecules in liquid cultures or on solid medium. These
mimetics also represent lead compounds for the development of novel antibiotics. In
this context, "corresponds" means that the peptidomimetic compound structure has
sufficient similarities to the structure of the active portion of a product of one of the
Enterococcus ORFs listed, that the peptidomimetic will interact with the same
molecule as the product of the ORF, and preferably will elicit at least one cellular
response in common which relates to the inhibition of the cell by the phage protein.

To validate the identity of an ORF as a killer ORF, it is preferably expressed
in the host or other test bacterial organism and the effect of this expression on
bacterial growth and replication is assessed. Therefore, all individual ORFs identified
herein, e.g., those identified above, can be expressed, preferably overexpressed, in a
suitable host bacterium, e.g., a host Enterococcus, and the effect of this expression or
overexpression on host metabolism and viability can be measured.

Individual ORFs can be resynthesized from the phage genomic DNA by the
polymerase chain reaction (PCR) using oligonucleotide primers flanking the ORF on
either side. Those skilled in the art are familiar with the design and synthesis of
appropriate primer sequences. These single ORFs are preferably engineered so that
they contain appropriate cloning sites at their extremities to allow their introduction
into a new bacterial expression plasmid, allowing propagation in a standard bacterial
host such as E. coli, but containing the necessary information for plasmid replication
in the target microbe, Enterococcus sp. (hereafter referred to as a shuttle vector).
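
As an illustration of engineering cloning sites onto the extremities of an ORF, the
following Python sketch builds a forward and a reverse primer that each carry a
restriction site on the 5' end. Every specific here is an assumption made for the
example: the BamHI (GGATCC) and HindIII (AAGCTT) sites, the three-base 5' clamp, the
annealing length, and the simplification that the primers anneal to the ends of the ORF
itself rather than to flanking sequence. Melting temperature and secondary structure
are not evaluated.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_cloning_primers(orf: str, anneal_len: int = 20,
                           fwd_site: str = "GGATCC",   # BamHI (assumed choice)
                           rev_site: str = "AAGCTT",   # HindIII (assumed choice)
                           clamp: str = "CGC"):
    """Return (forward, reverse) primers, written 5' to 3', with added sites."""
    orf = orf.upper()
    if len(orf) < anneal_len:
        raise ValueError("ORF is shorter than the requested annealing length")
    # Both sites are palindromic, so checking one strand covers both strands.
    for site in (fwd_site, rev_site):
        if site in orf:
            raise ValueError(f"site {site} also occurs inside the ORF")
    forward = clamp + fwd_site + orf[:anneal_len]
    reverse = clamp + rev_site + reverse_complement(orf[-anneal_len:])
    return forward, reverse


# Hypothetical 24-nucleotide ORF, used only to show the primer layout.
fwd, rev = design_cloning_primers("ATGGCTAAAGGTCTGACCGATTAA", anneal_len=9)
print(fwd)
print(rev)
```

Rejecting ORFs that contain either site internally reflects the practical point that the
insert would otherwise be cut during cloning; a different enzyme pair would then be
chosen.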

This shuttle vector also preferably contains regulatory sequences that allow
inducible expression of the introduced ORF. As the candidate ORF may encode a
killer function that will eliminate the host, it is highly advantageous that it not be
expressed (or at least not expressed at a substantial level) prior to testing for activity;
thus screening for such sequences in a constitutive fashion is less likely to be
successful because of lethality. In an example presented in Fig. 7, regulatory sequences
from the ars operon are used to direct individual ORF expression in Enterococcus. The
ars operon encodes a series of proteins which normally mediate the extrusion of arsenite
and several other trivalent oxyanions from the cells when they are exposed to such
toxic substances in their environment. The operon encoding this detoxifying
mechanism is normally silent and only induced when arsenite-related compounds are
present.

Therefore, individual phage ORFs can be expressed in Enterococcus or other
suitable host in an inducible fashion by adding to the culture medium non-toxic
arsenite concentrations during the growth of individual clones of Enterococcus (or
other host cells) expressing such individual phage ORFs. Toxicity of the phage killer
ORF for the host is monitored by reduction or arrest of growth under induction
conditions, as measured by optical density in liquid culture or after plating the
induced cultures on solid medium. Subsequently, interference of the phage ORF with
the host biochemical pathways ultimately leading to reducing or arresting host
metabolism can be measured by pulse chase experiments using radiolabeled
precursors of either DNA replication, RNA transcription, or protein synthesis.
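
Monitoring growth reduction by optical density can be reduced to a simple comparison of
induced and uninduced cultures. The Python sketch below is a hypothetical calculation,
not a prescribed assay: it assumes time-ordered optical density readings from a matched
pair of cultures and expresses the loss of net growth as a percentage. The function name
and the example readings are invented for illustration.

```python
def percent_growth_inhibition(od_uninduced, od_induced):
    """Percent reduction in net growth of the induced culture.

    Each argument is a time-ordered list of optical density readings taken at
    the same time points. Net growth is the last reading minus the first.
    """
    growth_control = od_uninduced[-1] - od_uninduced[0]
    growth_induced = od_induced[-1] - od_induced[0]
    if growth_control <= 0:
        raise ValueError("uninduced control culture did not grow")
    return 100.0 * (1.0 - growth_induced / growth_control)


# Invented readings: the induced culture reaches one quarter of the net growth.
control = [0.05, 0.20, 0.85]
induced = [0.05, 0.10, 0.25]
print(round(percent_growth_inhibition(control, induced), 1))
```

A culture carrying the vector without an ORF, grown with the same arsenite
concentration, would be the natural additional control, since the inducer itself may
affect growth.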

Of course, other inducible regulatory sequences (e.g., promoters, operators,
etc.) may be used (e.g., systems using positive induction of expression or systems
using release of repression). A variety of such systems are known to those skilled in
the art and can be utilized in the present invention.

Nucleic acid sequences of the present invention can be isolated using a method
similar to those described herein or other methods known to those skilled in the art.
In addition, such nucleic acid sequences can be chemically synthesized by
well-known methods. Having the phage 182 ORFs, e.g., anti-bacterial ORFs of the present
invention, portions thereof, or oligonucleotides derived therefrom as described, other
anti-microbial sequences from other bacteriophage sources can be identified and
isolated using methods described here or other methods, including methods utilizing
nucleic acid hybridization and/or computer-based sequence alignment methods.

The invention also provides bacteriophage anti-microbial DNA segments from
other phages based on nucleic acids and sequences hybridizing to the presently
identified inhibitory ORF under high stringency conditions or sequences which are
highly homologous. The bacteriophage anti-microbial DNA segment from
bacteriophage 182 can be used to identify a related segment from another unrelated
phage based on stringent conditions of hybridization or on being a homolog based on
nucleic acid and/or amino acid sequence comparisons. As with the phage 182
inhibitory sequences, such homologous coding sequences and products can be used as
antimicrobials, to construct active portions or derivatives, to construct
peptidomimetics, and to identify bacterial targets.
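
A sequence comparison of the kind used to recognize a homolog can be illustrated by a
global pairwise alignment followed by a percent-identity calculation. The Python sketch
below uses a plain Needleman-Wunsch alignment with a linear gap penalty; the scoring
values, the function name, and the example sequences are assumptions for illustration,
and no identity threshold for "highly homologous" is implied. Database searches in
practice rely on heuristic tools and substitution matrices rather than this simple
scheme.

```python
def global_alignment_identity(a: str, b: str, match: int = 1,
                              mismatch: int = -1, gap: int = -2) -> float:
    """Percent identity over the columns of an optimal global alignment."""
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back one optimal path, counting identical columns and all columns.
    i, j = n, m
    identical = columns = 0
    while i > 0 or j > 0:
        columns += 1
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch):
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
    return 100.0 * identical / columns if columns else 0.0


# Invented peptide fragments differing at one position.
print(global_alignment_identity("MKTAYIAK", "MKTAYIAR"))
```

The same function applies to nucleic acid or amino acid strings, matching the
"nucleic acid and/or amino acid sequence comparisons" described above.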

Enterococcus sequences are listed in Table 27 by accession number, providing
identification of possible targets of Enterococcus phage inhibitory ORF products, e.g.,
from phage 182.

Streptococcus pneumoniae 

As indicated in the Summary above, the present invention is concerned
with the use of Streptococcus sp. bacteriophage Dp-1 coding sequences and the
encoded polypeptides or RNA transcripts to identify bacterial targets for potential new
antibacterial agents.

Streptococcus pneumoniae is an important cause of community-acquired
pneumonia and a major cause of otitis media, sinusitis, and meningitis in children and
adults. In Spain and other Mediterranean countries, the majority of S. pneumoniae are
relatively resistant to penicillin (Klugman, 1990; Fenoll et al., 1991; Jorgensen et al.,
1990). These strains also have decreased susceptibility to broad-spectrum
cephalosporins, which are frequently used in the empiric treatment of meningitis and
other serious invasive bacterial infections. High-level resistance of pneumococci has
been encountered in Hungary where 70% of children who were colonized with S.
pneumoniae carried penicillin resistant strains that were also resistant to tetracycline,
erythromycin, trimethoprim/sulfamethoxazole, and 30% resistant to chloramphenicol
(Neu, 1992). The resistance of pneumococci to macrolides such as erythromycin
averages 20-25% in France, ~20% in Japan, and <10% in Spain (Neu, 1992).

The antimicrobial susceptibilities and distribution of serotypes of the 42
isolates of S. pneumoniae in southern Taiwan from invasive infections have been
recently determined (Hseuh et al., 1996). Resistance rates among these isolates were:
erythromycin, 61.9%; clindamycin, 47.6%; chloramphenicol, 19%; and tetracycline,
73.8%. Resistance to three or more classes of antibiotics was found in 33.3% of the
isolates. Bacteremic pneumonia and primary bacteremia accounted for 64.3% of the
infections and mortality was 42.6%. Given the severity of these infections despite
adequate antibiotic therapy, there is clearly a need for introduction of new therapeutic
options to prevent mortality due to invasive S. pneumoniae infections.

Pneumococcal phages belong to four families and they present a great variety
in morphology, including lytic and temperate phages (for a review, see Garcia et al.,
1997). Examples of lytic phages are Cp-1 and Dp-1, whereas examples of temperate
phages are HB-3, EJ-1, and HB-746. The complete nucleotide sequence and
functional organization of Cp-1 has been reported (Martin et al., 1996). Cp-1 has a
19,345 bp double-stranded DNA genome, with a terminal protein covalently linked to
its 5' ends, that replicates by a protein primed mechanism. The phage contains 29
ORFs, 23 on one strand and 6 on the opposite. When these predicted proteins were
compared to sequences compiled in GenBank EMBL databases, 10 ORFs showed
significant similarity to proteins of bacteriophage phi 29 that infects B. subtilis
(Martin et al., 1996). The similar proteins corresponded to those involved in DNA
replication (terminal protein and DNA polymerase), structural and morphogenic proteins
(major head, collar, connector, tail, and encapsidation proteins), and proteins involved
in lysis function (holin and lysozyme). In its strategy of lysis, the holin gene product
inserts itself into the cell membrane, allowing access of the lysozyme to the
peptidoglycan. Expression of the Cp-1 holin protein in E. coli results in cell death
after 2 hours of induction, but did not lead to lysis (Garcia et al., 1997). Cells
harboring a plasmid construction with holin and lysozyme genes together did lyse after
induction and the viability loss was similar to that of the culture expressing holin
alone. Cloning of these lytic genes in S. pneumoniae showed that both genes had the
same effect as in E. coli. That is, holin itself did not lyse the culture but the
viability loss was noticeable, whereas both holin and lysozyme together were capable of
lysing M31, an amidase deleted mutant (Garcia et al., 1997).

Recently, a small portion (~4 kbp) of a second S. pneumoniae phage, Dp-1,
has been sequenced (Sheehan et al., 1997). This portion contains the genes coding for
the lytic system (Sheehan et al., 1997) and shows a modular organization similar to
that described for Cp-1. However, in this case, a single chimeric protein appears to be
made in which the N-terminal domain is highly similar to that of the murein hydrolase
coded by a gene found in the phage BK5-T that infects Lactococcus lactis, and the
C-terminal domain is homologous to holins. Thus, both functions appear to have been
combined in a novel chimeric protein.

Bacteriophage Dp-1 was obtained from Dr. P. Garcia (Departamento de
Microbiologia Molecular, Centro de Departamento de Investigaciones Biologicas,
Consejo Superior de Investigaciones Cientificas, Velazquez, Madrid, Spain). We
found that Dp-1 has a double-stranded DNA genome of 56,506 bp, predicted to
encode 85 ORFs greater than 33 amino acids and with upstream Shine-Dalgarno
motifs for translation initiation (Tables 28 & 30, and Fig. 6). The predicted protein
products of Streptococcus bacteriophage Dp-1 for which computational analysis
detected homologs in public databases are listed in Table 31, along
with the accompanying list of related proteins.

From this analysis, it is apparent that several predicted genes of Dp-1 encode
polypeptides that are related to structural proteins. ORFs 001, 002, 004, and 030 are
predicted to encode tail proteins, minor structural proteins, and minor capsid proteins
(Table 31). We also note the identification of several gene products that are likely
involved in DNA synthesis. These include ORF 3, which encodes DNA polymerase,
ORF 8, which encodes a SWI/SNF helicase-related protein, ORF 10, which encodes a
protein showing homology to recA, and ORF 13, which encodes a dnaZX-like ORF.

In E. coli, RapA is an RNA polymerase (RNAP)-associated protein with
ATPase activity and is a homolog of the eukaryotic SWI/SNF family, a set of
proteins whose members are involved in transcription activation,
nucleosome remodeling, and DNA repair. RapA forms a stable complex with RNAP,
as if it were a subunit of RNAP, and it is possible that the ORF 8 product behaves
similarly or in a dominant-negative fashion to inhibit the activity of RapA. Mutation
of the essential E. coli dnaZX results in a block in DNA chain elongation during
replication (Maki et al., 1988). The dnaZX gene has only one open reading frame for
a 71-kDa polypeptide from which the two distinct DNA polymerase III holoenzyme
subunits, tau (71 kDa) and gamma (47 kDa), are produced. The tau subunit is the
precursor of the gamma subunit, and the gamma subunit is produced by a -1
frameshift causing early termination of translation (Tsuchihashi et al., 1990). These
proteins show single-strand DNA binding that is ATPase (and dATPase)
dependent and are thought to increase the processivity of the core DNA polymerase
enzyme (Lee et al., 1987).

There are several Dp-1 ORFs which encode proteins predicted to play a role in
cellular metabolic pathways. These include polypeptides involved in coenzyme PQQ
synthesis (ORFs 20, 29, 38). Pyrrolo-quinoline quinone (PQQ) is the non-covalently
bound prosthetic group of many quinoproteins catalysing reactions in the periplasm of
Gram-negative bacteria. Most of these involve the oxidation of alcohols or aldose
sugars. Interestingly, ORFs 20, 29, and 30 also show homology to the exoenzyme S
regulon (Frank, 1997). Proteins encoded by the P. aeruginosa exoenzyme S regulon
may be involved in a contact-mediated translocation mechanism to transfer anti-host
factors directly into eukaryotic cells, disrupting eukaryotic signal transduction through
ADP-ribosylation (Frank, 1997).

There is also a protein with similarity to GTP cyclohydrolase I (ORF 21) and
ORF 41 which shows homology to dUTPase (Table 31). GTP cyclohydrolase I is an
enzyme that catalyzes the first reaction in the pathway for the biosynthesis of the
pteridine, a cofactor of the monooxygenases of the aromatic amino acids. Disruption
of the homologous gene in Saccharomyces cerevisiae leads to a recessive conditional
lethality due to folinic acid auxotrophy, that can be complemented with the
mammalian or bacterial GTP cyclohydrolase I enzymes (Nardese et al., 1996; Mancini
et al., 1999).

ORF 16 shows high homology to autolysin. This region of the phage sequence
was previously reported (Sheehan et al., 1997) and encompasses ~4 kbp of our
sequence. The sequence published by Sheehan et al. (1997) is shown in Table 32.

Thus, the present invention provides a nucleic acid sequence obtained from
Streptococcus bacteriophage Dp-1 comprising at least a portion of a phage Dp-1 ORF,
preferably an inhibitory ORF, and more preferably at least a portion of one of the
genes described above with anti-microbial activity. For example, ORF 013 encodes a
protein with homology to the gamma subunit of DNA polymerase (dnaX gene). This
protein may act in a dominant-negative fashion to sequester the host DNA polymerase
for its own replication, thus inhibiting host DNA replication. The dnaX gene product
is essential for E. coli replication (Kodaira et al., 1983).

In certain preferred embodiments of the present invention, the bacterial target of
a bacteriophage inhibitor ORF product, e.g., an inhibitory protein or polypeptide, is
encoded by a Streptococcus nucleic acid coding sequence from a host bacterium for
bacteriophage Dp-1. As above, possible target sequences are described herein by
reference to sequence source sites. The sequence encoding the target preferably
corresponds to a Streptococcus nucleic acid sequence available from The Institute for
Genomic Research (TIGR), or available from GenBank or other public database. The
TIGR Streptococcus sequences are publicly available at The Institute for Genomic
Research at URL: http://www.tigr.org

The amino acid sequence of a polypeptide target is readily provided by
translating the corresponding coding region. For the sake of brevity, the sequences
are not reproduced herein. Also, in preferred embodiments, a target sequence
corresponds to a Streptococcus pneumoniae coding sequence corresponding to a
sequence listed in Table 33 herein. Sequences for other Streptococcal species are also
available from TIGR and/or from GenBank. The listing in Table 33 describes
Streptococcus sequences currently deposited in GenBank. Again, for the sake of
brevity, the sequences are described by reference to the GenBank entries instead of
being written out in full herein. In cases where the TIGR or GenBank entry for a
coding region is not complete, the complete sequence can be readily obtained by
routine methods, e.g., by isolating a clone in a phage Dp-1 host Streptococcus sp.
genomic library, and sequencing the clone insert to provide the relevant coding
region. The boundaries of the coding region can be identified by conventional
sequence analysis and/or by expression in a bacterium in which the endogenous copy
of the coding region has been inactivated and using subcloning to identify the
functional start and stop codons for the coding region.

In the various aspects of this invention involving Dp-1 sequences, the 
sequence is preferably not contained in the sequence described in Sheehan et al., 1997 
(Table 32). 

Validating Identified Inhibitory Phage ORFs 
A fifth step involves validating the identified phage inhibitor ORF by 
independent methods, and delineating further possible smaller segments of the ORFs 
that have inhibitory activity. Several methods exist to validate the role of the 
identified ORF as an inhibitor ORF. 

One example utilizes the creation of a mutant variant of the phage ORF in 
which the candidate ORF carries a partial or complete loss-of-function mutation that 
is measurable as compared with the non-mutant ORF. Comparison of the effects of 
expression of the loss-of-function mutant with the normal ORF provides confirmation 
of the identification of an inhibitor ORF where the loss-of-function mutant provides a 
measurably lower level of inhibition, preferably no inhibition. The loss of function 
may be conditional, e.g., temperature sensitive. 

Once validation of the inhibitor ORF is achieved, a bi-directional deletion 
analysis can be carried out using the same experimental system to identify the 
minimal polypeptide segment that has inhibitor activity. This may be carried out by a 
variety of means, e.g., by exonuclease or PCR methodologies, and is used to 
determine if a relatively small segment of the ORF (i.e., the product of the ORF) still 
possesses inhibitory activity when isolated away from its native sequence. If so, a 
portion of the ORF encoding this "active portion" can be used as a template for the 
synthesis of novel anti-microbial agents, further allowing derivation of the peptide 
sequence, e.g., using modified peptides and/or peptidomimetics. 
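
The logic of the bi-directional deletion analysis can be sketched as follows. This 
is an illustration under stated assumptions, not the experimental protocol: the 
function name minimal_active_segment is hypothetical, and the is_active callback 
stands in for the growth-inhibition assay performed on each truncated construct. 
The sketch further assumes that any fragment containing the active core remains 
active, so that residues can be removed one at a time from each terminus. 

```python
from typing import Callable


def minimal_active_segment(protein: str, is_active: Callable[[str], bool]) -> str:
    """Trim from the N-terminus, then the C-terminus, while activity is retained.

    `is_active` is a placeholder for the inhibition assay (assumption).
    """
    if not is_active(protein):
        raise ValueError("full-length product shows no inhibitory activity")
    start, end = 0, len(protein)
    # N-terminal deletions: drop one residue at a time while still active.
    while start + 1 < end and is_active(protein[start + 1:end]):
        start += 1
    # C-terminal deletions on the already N-trimmed fragment.
    while end - 1 > start and is_active(protein[start:end - 1]):
        end -= 1
    return protein[start:end]
```

In practice each call to is_active corresponds to constructing and testing one 
deletion clone, so a real series would normally remove larger blocks of residues per 
step rather than single residues. 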

In creation of certain peptidomimetics, the peptide backbone is transformed 
into a carbon-based hydrophobic structure that can retain inhibitor activity against the 
bacterium. This is done by standard medicinal chemistry methods, typically 
monitored by measuring growth inhibition by the various molecules in liquid cultures 
or on solid medium. These mimetics can also represent lead compounds for the 
development of novel antibiotics. 

Recently, a major effort has been undertaken by the pharmaceutical industry 
and their biotechnology partners for the sequencing of bacterial pathogen genomes. 
The rationale is that the systematic sequencing of the genome will identify all of the 
bacterial proteins and therefore this proteome will be the target for designing novel 
inhibitor antibiotics. Although systematic, this approach has several major problems. 
The first is that analysis of primary amino acid sequences of bacterial proteins does 
not immediately reveal which protein will be essential for viability of the bacterium, 
and target validation is thus a major issue. The second problem is one of redundancy, 
as several biochemical pathways are either structurally duplicated in bacteria 
(different isoforms of the same enzyme), or functionally duplicated by the presence of 
salvage pathways in the event of a metabolic block in one pathway (different 
nutritional conditions). The third is that even a valid target may not be structurally or 
functionally amenable to inhibition by small molecules because of inaccessibility 
(sequestration of target). 

Therefore, there is considerable interest within the pharmaceutical and 
biotechnology industry in identifying key targets for drug discovery amongst the mass 
of novel targets generated by large-scale genomic sequencing projects. 

On the other hand, and underscoring the instant invention, the phages herein 
described have, over millions of years, evolved specific mechanisms to target such 
key biochemical pathways and proteins. In the few cases where inhibition by phages 
has been elucidated (e.g., see ref. 3), such bacterial targets are invariably rate-limiting 
in their respective biochemical pathways, are not redundant, and/or are readily 
accessible for inhibition by the phage (or by another inhibitory compound). 
Therefore, the sixth step of this invention involves identifying the host biochemical 
pathways and proteins that are targeted by the phage inhibitory mechanisms. 

Identifying, Validating, and Characterizing Bacterial Host Target Proteins and 
Affected Pathways 

A rationale for this step is that the inhibitor ORF product from the phage 
physically interacts with and/or modifies certain microbial host components to block 
their function. Exemplary approaches which can be used to identify the host bacterial 
pathways and proteins that interact with, and preferably also are inhibited by, phage 
ORF product(s) are described below. 

One approach is a genetic screen to determine physiological protein:protein 
interaction, for example, using a yeast two-hybrid system. In this assay, the phage 
ORF is fused to the carboxyl terminus of the yeast Gal4 activation domain II (amino 
acids 768-881) to create a bait vector. A cDNA library of cloned S. aureus sequences 
which have been engineered into a plasmid where the S. aureus sequences are fused to 
the DNA binding domain of Gal4 is also generated. These plasmids are introduced 
alone, or in combination, into yeast strain Y190, previously engineered with 
chromosomally integrated copies of the E. coli lacZ and the selectable HIS3 genes, 
both under Gal4 regulation (Durfee, T., Becherer, K., Chen, P.-L., Yeh, S.-H., Yang, 
Y., Kilburn, A.E., Lee, W.-H., and Elledge, S.J. (1993). Genes & Dev. 7, 555-569). If 
the two proteins expressed in yeast interact, the resulting complex will activate 
transcription from promoters containing Gal4 binding sites. A lacZ and His3 gene, 
each driven by a promoter containing Gal4 binding sites, have been integrated into the 
genome of the host yeast system used for measuring protein-protein interactions. Such 
a system provides a physiological environment in which to detect potential protein 
interactions. This system has been extensively used to identify novel protein-protein 
interaction partners and to map the sites required for interaction (for example, to 
identify interacting partners of translation factors (Qiu, H., Garcia-Barrio, M.T., and 
Hinnebusch, A.G. (1998). Mol & Cell Biology 18, 2697-2711), transcription factors 
(Katagiri, T., Saito, H., Shinohara, A., Ogawa, R., Kamada, N., Nakamura, Y., and 
Miki, Y. (1998). Genes, Chromosomes & Cancer 21, 217-222), and proteins involved 
in signal transduction (Endo, T.A., Masuhara, M., Yokouchi, M., Suzuki, R., 
Sakamoto, H., Mitsui, K., Matsumoto, A., Tanimura, S., Ohtsubo, M., Misawa, H., 
Miyazaki, T., Leonor, N., Taniguchi, T., Fujita, T., Kanakura, Y., Komiya, S., and 
Yoshimura, A. (1997). Nature 387, 921-924)). This approach has also been used in many 
published reports to identify interaction between mammalian viral and mammalian 
cell proteins. 

For example, the non-structural protein NS1 of parvovirus is essential for viral 
DNA amplification and gene expression and is also the major cytopathic effector of 
these viruses. A yeast two-hybrid screen with NS1 identified a novel cellular protein 
of unknown function that interacts with NS1, called SGT, for small glutamine-rich 
tetratricopeptide repeat (TPR)-containing protein (Cziepluch, C., Kordes, E., Poirey, R., 
Grewenig, A., Rommelaere, J., and Jauniaux, J.C. (1998). J. Virol. 72, 4149-4156). In 
another screen, the adenovirus E3 protein was recently shown to interact with a novel 
tumor necrosis factor alpha-inducible protein and to modulate some of the activities of 
E3 (Li, Y., Kang, J., and Horwitz, M.S. (1998). Mol & Cell Biol 18, 1601-1610). In yet 
another recent screen, the herpes simplex virus 1 alpha regulatory protein ICP0 was 
found to interact with (and stabilize) the cell cycle regulator cyclin D3 (Kawaguchi, Y., 
Van Sant, C., and Roizman, B. (1997). J. Virol. 71, 7328-7336). 

Another two-hybrid system for identifying protein:protein interactions is 
commercially available from STRATAGENE™ as the CYTO-TRAP™ system 
(Chang et al., Strategies Newsletter 11(3), 65-68 (1998) (from Stratagene)). The 
system is a yeast-based method for detecting protein:protein interactions in vivo, using 
activation of the Ras signal transduction cascade by localizing a signal pathway 
component, human Sos (hSos), to its activation site in the yeast plasma membrane. 
The system uses a temperature-sensitive Saccharomyces cerevisiae mutant, strain 
cdc25H, which contains a point mutation at amino acid residue 1328 of the cdc25 
gene. This gene encodes a guanyl nucleotide exchange factor which binds and 
activates Ras, leading to cell growth. The mutation in the cdc25 gene prevents host 
growth at 37°C, but at a permissive temperature of 25°C, growth is normal. The 
system utilizes the ability of hSos to complement the cdc25 defect and activate the 
yeast Ras signaling pathway. Once hSos is expressed and localized to the plasma 
membrane, the cdc25H yeast strain grows at 37°C. Localizing hSos to the plasma 
membrane occurs through a protein:protein interaction. A protein of interest, or bait, 
is expressed as a fusion protein with hSos. The library, or target, proteins are 
expressed with the myristylation membrane-localization signal. The yeast cells are 
then incubated under restrictive conditions (37°C). If the bait and the target protein 
interact, the hSos protein is recruited to the membrane, activating the Ras signaling 
pathway and allowing the cdc25H yeast strain to grow at the restrictive temperature. 

The protein targets of phage inhibitory ORFs can also be identified using 
bacterial genetic screens. One approach involves the overexpression of a phage 
inhibitory protein in mutagenized bacterial host species, followed by plating the cells 
and searching for colonies that can survive the antimicrobial activity of the inhibitory 
ORF. These colonies are then grown, their DNA extracted, and cloned into an 
expression vector that contains a replicon of a different incompatibility group from 
the plasmid expressing the original ORF. This library is then introduced into a 
wild-type host bacterium in conjunction with an expression vector driving synthesis of the 
phage ORF, followed by selection for surviving bacteria. Thus, bacterial DNA 
fragments from the survivors presumably contain a DNA fragment from the original 
mutagenized host bacterial genome that can protect the cell from the antimicrobial 
activity of the inhibitory phage ORF. This fragment can be sequenced and compared 
with that of the bacterial host to determine in which gene the mutation lies. This 
approach enables one to determine the targets and pathways that are affected by the 
killing function. 

A second approach is based on identifying protein:protein interactions 
between the phage ORF product and bacterial, e.g., S. aureus, proteins using a 
biochemical approach based, for example, on affinity chromatography. This approach 
has been used, for example, to identify interactions between lambda phage proteins 
and proteins from their E. coli host (Sopta, M., Carthew, R.W., and Greenblatt, J. 
(1985) J. Biol. Chem. 260, 10353-10369). The phage ORF is fused to a peptide tag 
(e.g., glutathione-S-transferase ("GST"), 6xHIS ("HIS"), and/or calmodulin binding 
protein ("CBP")) within a commercially available plasmid vector that directs high 
level expression on induction of a suitably responsive promoter driving the fusion's 
expression. The translated fusion protein is expressed in E. coli, purified, and 
immobilized on a solid phase matrix via, for example, the tag. Total cell extracts from 
the host bacterium, e.g., S. aureus, are then passed through the affinity matrix 
containing the immobilized phage ORF fusion protein; host proteins retained on the 
column are then eluted under different conditions of ionic strength, pH, detergents, 
etc., and characterized by gel electrophoresis and other techniques. Appropriate 
controls are run to guard against nonspecific binding to the resin. Target proteins thus 
recovered should be enriched for proteins that bind the phage protein/peptide of 
interest and are subsequently electrophoretically or otherwise separated, purified, 
sequenced, or biochemically analyzed. Usually sequencing entails individual digestion 
of the proteins to completion with a protease (e.g., trypsin), followed by molecular mass 
and amino acid composition and sequence determination using, for example, mass 
spectrometry, e.g., by MALDI-TOF technology (Qin, J., Fenyo, D., Zhao, Y., Hall, 
W.W., Chao, D.M., Wilson, C.J., Young, R.A. and Chait, B.T. (1997). Anal. Chem. 
69, 3995-4001). 

The sequences of the individual peptides from a single protein are then 
analyzed by the bioinformatics approach described above to identify the S. aureus 
protein interacting with the phage ORF. This analysis is performed by a computer 
search of the S. aureus genome for an identified sequence. Alternatively, all tryptic 
peptide fragments of the S. aureus genome can be predicted by computer software, 
and the molecular mass of such fragments compared to the molecular mass of the 
peptides obtained from each interacting protein eluted from the affinity matrix. The 
responsible gene sequence can be obtained, for example, by using synthetic degenerate 
nucleic acid sequences to pull out the corresponding homologous bacterial sequence. 
Alternatively, antibodies can be generated against the peptide and used to isolate 
nascent peptide/mRNA transcript complexes, from which the mRNA can be reverse 
transcribed, cloned, and further characterized using the procedures discussed herein. 
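
The comparison of predicted and observed tryptic peptide masses can be sketched as 
follows. This is a simplified illustration, not the software referred to above. It 
assumes complete digestion, the common rule that trypsin cleaves after lysine (K) or 
arginine (R) except when the next residue is proline (P), unmodified peptides, and 
neutral monoisotopic masses. The function names, the 0.5 Da tolerance, and the 
scoring by simple match count are all hypothetical choices made for the example. 

```python
# Monoisotopic residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_peptides(protein):
    """Split after K or R unless the following residue is P (assumed rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def rank_candidates(observed_masses, proteome, tolerance=0.5):
    """Rank proteins by how many observed masses match a predicted peptide."""
    scores = {}
    for name, sequence in proteome.items():
        predicted = [peptide_mass(p) for p in tryptic_peptides(sequence)]
        scores[name] = sum(
            any(abs(mass - p) <= tolerance for p in predicted)
            for mass in observed_masses
        )
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Here proteome would be a mapping from protein names to the amino acid sequences 
predicted from the S. aureus genome, and observed_masses the peptide masses measured 
for one eluted protein. The top-ranked entries are only candidates and still require 
the independent validation described below. 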

A variety of other binding assay methods are known in the art and can be used 
to identify interactions between phage proteins and bacterial proteins or other bacterial 
cell components. Such methods that allow or provide identification of the bacterial 
component can be used in this invention for identifying putative targets. 

Validation of the interaction between the phage ORF product and the bacterial 
proteins or other components can be obtained by a second independent assay (e.g., 
co-immunoprecipitation or protein-protein crosslinking experiments (Qiu, H., 
Garcia-Barrio, M.T., and Hinnebusch, A.G. (1998). Mol & Cell Biology 18, 2697-2711; 
Brown, S. and Blumenthal, T. (1976). Proc. Natl. Acad. Sci. USA 73, 1131-1135)). 

Finally, the essential nature of the identified bacterial proteins is preferably 
determined genetically by creating a constitutive or inducible partial or complete 
loss-of-function mutation in the gene encoding the identified interacting bacterial protein. 
This mutant is then tested for bacterial survival and replication. 

The protein target of the phage inhibitor function can also be identified using a 
genetic approach. Two exemplary approaches will be delineated here. The first 
approach involves the overexpression of a predetermined phage inhibitor protein in 
mutagenized host bacteria, e.g., S. aureus, followed by plating the cells and searching 
for colonies that can survive the inhibitor. These colonies will then be grown, their 
DNA extracted and cloned into an expression vector that contains a replicon of a 
different incompatibility group, and preferably having a different selectable marker 
than the plasmid expressing the phage inhibitor. Thus, host DNA fragments from the 
mutant that can protect the cell from phage ORF inhibition can be sequenced and 
compared with that of the bacterial host to determine in which gene the mutation lies. 
This approach allows rapid determination of the targets and pathways that are affected 
by the inhibitor. 

Alternatively, the bacterial targets can be determined in the absence of 
selecting for mutations using an approach known as "multicopy suppression". In this 
approach, the DNA from the wild type host is cloned into an expression vector that 
can coexist, as previously described, with one containing a predetermined phage 
inhibitor. Those plasmids that contain host DNA fragments and genes that protect the 
host from the phage inhibitor can then be isolated and sequenced to identify putative 
targets and pathways in the host bacteria. 

Regardless of the specific mode of identification, screening assays may 
additionally utilize gene fusions to specific "reporter genes" to identify a bacterial 
gene(s) whose expression is affected when the host target pathway is affected by the 
phage inhibitor. Such gene fusions can be used to search a number of small molecule 
compounds for inhibitors that may affect this pathway and thus cause cell inhibition. 
This approach will allow the screening of a large number of molecules on petri dishes 
or in 96-well format by monitoring for a simple color change in the bacterial colonies. 
In this manner, we can validate host targets and classes of compounds for further 
study and clinical development. These inhibitors also represent lead compounds for 
the development of other antibiotics. 

Bioinformatics and comparative genomics are preferably then applied to the 
identified bacterial gene products to predict biochemical function. The biochemical 
activity of the protein can be verified in vitro in cell free assays or in vivo in intact 
cells. In vitro biochemical assays utilizing cell-free extracts or purified protein are 
established as a basis for the screening and development of inhibitors. 

These inhibitors, preferably small molecule inhibitors, may comprise peptides, 
antibodies, products from natural sources such as fungal or plant extracts or small 
molecule organic compounds. In general, small molecule organic compounds are 
preferred. These compounds may, for example, be identified within large compound 
libraries, including combinatorial libraries. For example, a plurality of compounds, 
preferably a large number of compounds, can be screened to determine whether any of 
the compounds binds or otherwise disrupts or inhibits the identified bacterial target. 
Compounds identified as having any of these activities can then be evaluated further 
in cell culture and/or animal model systems to determine the pharmacological 
properties of the compound, including the specific anti-microbial ability of the 
compound. 

For mixtures of natural products, including crude preparations, once a 
preparation or fraction of a preparation is shown to have an anti-microbial activity, 
the active substance can be isolated and identified using techniques well known in the 
art, if the compound is not already available in a purified form. 

Identified compounds possessing anti-microbial activity and similar 
compounds having structural similarity can be further evaluated and, if necessary, 
derivatized according to synthesis and/or modification methods available in the art 
selected as appropriate for the particular starting molecule. 

Derivatization of identified anti-microbials 

In cases where the identified anti-microbials above might represent peptidal 
compounds, the in vivo effectiveness of such compounds may be advantageously 
enhanced by chemical modification using the natural polypeptide as a starting point 
and incorporating changes that provide advantages for use, for example, increased 
stability to proteolytic degradation, reduced antigenicity, improved tissue penetration, 
and/or improved delivery characteristics. 

In addition to active modifications and derivative creations, it can also be 
useful to provide inactive modifications or derivatives for use as negative controls or 
for induction of immunologic tolerance. For example, a biologically inactive 
derivative which has essentially the same epitopes as the corresponding natural 
anti-microbial can be used to induce immunological tolerance in a patient being 
treated. The induction of tolerance can then allow uninterrupted treatment with the 
active anti-microbial to continue for a significantly longer period of time. 

Modified anti-microbial polypeptides and derivatives can be produced using a 
number of different types of modifications to the amino acid chain. Many such 
methods are known to those skilled in the art. The changes can include, for example, 
reduction of the size of the molecule, and/or the modification of the amino acid 
sequence of the molecule. In addition, a variety of different chemical modifications of 
the naturally occurring polypeptide can be used, either with or without modifications 
to the amino acid sequence or size of the molecule. Such chemical modifications can, 
for example, include the incorporation of modified or non-natural amino acids or 
non-amino acid moieties during synthesis of the peptide chain, or the post-synthesis 
modification of incorporated chain moieties. 

The oligopeptides of this invention can be synthesized chemically or through 
an appropriate gene expression system. Synthetic peptides can include both naturally 
occurring amino acids and laboratory synthesized, modified amino acids. 

Also provided herein are functional derivatives of anti-microbial proteins or 
polypeptides. By "functional derivative" is meant a "chemical derivative," 
"fragment," "variant," "chimera," or "hybrid" of the polypeptide or protein, which 
terms are defined below. A functional derivative retains at least a portion of the 
function of the protein, for example reactivity with a specific antibody, enzymatic 
activity or binding activity. 

A "chemical derivative" of the complex contains additional chemical moieties 
not normally a part of the protein or peptide. Such moieties may improve the 
molecule's solubility, absorption, biological half-life, and the like. The moieties may 
alternatively decrease the toxicity of the molecule, eliminate or attenuate any 
undesirable side effect of the molecule, and the like. Moieties capable of mediating 
such effects are disclosed in Alfonso and Gennaro (1995). Procedures for coupling 
such moieties to a molecule are well known in the art. Covalent modifications of the 
protein or peptides are included within the scope of this invention. Such 
modifications may be introduced into the molecule by reacting targeted amino acid 
residues of the peptide with an organic derivatizing agent that is capable of reacting 
with selected side chains or terminal residues, as described below. 

Cysteinyl residues most commonly are reacted with alpha-haloacetates (and 
corresponding amines), such as chloroacetic acid or chloroacetamide, to give 
carboxymethyl or carboxyamidomethyl derivatives. Cysteinyl residues also are 
derivatized by reaction with bromotrifluoroacetone, chloroacetyl phosphate, 
N-alkylmaleimides, 3-nitro-2-pyridyl disulfide, methyl 2-pyridyl disulfide, 
p-chloromercuribenzoate, 2-chloromercuri-4-nitrophenol, or 
chloro-7-nitrobenzo-2-oxa-1,3-diazole. 

Histidyl residues are derivatized by reaction with diethylpyrocarbonate at pH 
5.5-7.0 because this agent is relatively specific for the histidyl side chain. 
Para-bromophenacyl bromide also is useful; the reaction is preferably performed in 0.1 M 
sodium cacodylate at pH 6.0. 

Lysinyl and amino terminal residues are reacted with succinic or other 
carboxylic acid anhydrides. Derivatization with these agents has the effect of 
reversing the charge of the lysinyl residues. Other suitable reagents for derivatizing 
primary amine-containing residues include imidoesters such as methyl 
picolinimidate; pyridoxal phosphate; pyridoxal; chloroborohydride; 
trinitrobenzenesulfonic acid; O-methylisourea; 2,4-pentanedione; and 
transaminase-catalyzed reaction with glyoxylate. 

Arginyl residues are modified by reaction with one or several conventional 
reagents, among them phenylglyoxal, 2,3-butanedione, 1,2-cyclohexanedione, and 
ninhydrin. Derivatization of arginine residues requires that the reaction be performed 
in alkaline conditions because of the high pKa of the guanidine functional group. 
Furthermore, these reagents may react with the groups of lysine as well as the arginine 
alpha-amino group. 

Tyrosyl residues are well-known targets of modification for introduction of 
spectral labels by reaction with aromatic diazonium compounds or tetranitromethane. 
Most commonly, N-acetylimidazole and tetranitromethane are used to form O-acetyl 
tyrosyl species and 3-nitro derivatives, respectively. 

Carboxyl side groups (aspartyl or glutamyl) are selectively modified by 
reaction with carbodiimides (R'-N=C=N-R') such as 1-cyclohexyl-3-(2-morpholinyl-(4-ethyl)) 
carbodiimide or 1-ethyl-3-(4-azonia-4,4-dimethylpentyl) carbodiimide. Furthermore, 
aspartyl and glutamyl residues are converted to asparaginyl and glutaminyl residues 
by reaction with ammonium ions. 

Glutaminyl and asparaginyl residues are frequently deamidated to the 
corresponding glutamyl and aspartyl residues. Alternatively, these residues are 
deamidated under mildly acidic conditions. Either form of these residues falls within 
the scope of this invention. 

Derivatization with bifunctional agents is useful, for example, for 
cross-linking component peptides to each other or the complex to a water-insoluble support 
matrix or to other macromolecular carriers. Commonly used cross-linking agents 
include, for example, 1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, 
N-hydroxysuccinimide esters, for example, esters with 4-azidosalicylic acid, 
homobifunctional imidoesters, including disuccinimidyl esters such as 
3,3'-dithiobis(succinimidylpropionate), and bifunctional maleimides such as 
bis-N-maleimido-1,8-octane. Derivatizing agents such as 
methyl-3-[(p-azidophenyl)dithio]propioimidate yield photoactivatable intermediates that 
are capable of forming crosslinks in the presence of light. Alternatively, reactive 
water-insoluble matrices such as cyanogen bromide-activated carbohydrates and the 
reactive substrates 
described in U.S. Patent Nos. 3,969,287; 3,691,016; 4,195,128; 4,247,642; 4,229,537; 
and 4,330,440 are employed for protein immobilization. 

Other modifications include hydroxylation of proline and lysine, 
phosphorylation of hydroxyl groups of seryl or threonyl residues, methylation of the 
alpha-amino groups of lysine, arginine, and histidine side chains (Creighton, T.E., 
Proteins: Structure and Molecular Properties, W.H. Freeman & Co., San Francisco, 
pp. 79-86 (1983)), acetylation of the N-terminal amine, and, in some instances, 
amidation of the C-terminal carboxyl groups. 

Such derivatized moieties may improve the stability, solubility, absorption, 
biological half-life, and the like. The moieties may alternatively eliminate or attenuate 
any undesirable side effect of the protein complex. Moieties capable of mediating 
such effects are disclosed, for example, in Alfonso and Gennaro (1995). 

The term "fragment" is used to indicate a polypeptide derived from the amino 
acid sequence of the protein or polypeptide having a length less than the full-length 
polypeptide from which it has been derived. Such a fragment may, for example, be 
produced by proteolytic cleavage of the full-length protein. Preferably, the fragment 
is obtained recombinantly by appropriately modifying the DNA sequence encoding 
the proteins to delete one or more amino acids at one or more sites of the C-terminus, 
N-terminus, and/or within the native sequence. 

Another functional derivative intended to be within the scope of the present 
invention is a "variant" polypeptide that either lacks one or more amino acids or 
contains additional or substituted amino acids relative to the native polypeptide. The 
variant may be derived from a naturally occurring polypeptide by appropriately 
modifying the protein DNA coding sequence to add, remove, and/or to modify codons 
for one or more amino acids at one or more sites of the C-terminus, N-terminus, 
and/or within the native sequence. 

A functional derivative of a protein or polypeptide with deleted, inserted 
and/or substituted amino acid residues may be prepared using standard techniques 
well-known to those of ordinary skill in the art. For example, the modified 
components of the functional derivatives may be produced using site-directed 
mutagenesis techniques (as exemplified by Adelman et al., 1983, DNA 2:183; 
Sambrook et al., 1989) wherein nucleotides in the DNA coding sequence are modified 
such that a modified coding sequence is produced, and thereafter expressing this 
recombinant DNA in a prokaryotic or eukaryotic host cell, using techniques such as 
those described above. Alternatively, components of functional derivatives of 
complexes with amino acid deletions, insertions and/or substitutions may be 
conveniently prepared by direct chemical synthesis, using methods well-known in the 
art. 

Insofar as other anti-microbial inhibitor compounds identified by the invention 
described herein may not be peptidal in nature, other chemical techniques exist to 
allow their suitable modification, as well, and according to the desirable principles 
discussed above. 

Administration and Pharmaceutical Compositions 

For the therapeutic and prophylactic treatment of infection, the preferred 
method of preparation or administration of anti-microbial compounds will generally 
vary depending on the precise identity and nature of the anti-microbial being 
delivered. Thus, those skilled in the art will understand that administration methods 
known in the art will also be appropriate for the compounds of this invention. 

The particularly desired anti-microbial can be administered to a patient either 
by itself, or in pharmaceutical compositions where it is mixed with suitable carriers or 
excipient(s). In treating an infection, a therapeutically effective amount of an agent or 
agents is administered. A therapeutically effective dose refers to that amount of the 
compound that results in amelioration of one or more symptoms of bacterial infection 
and/or a prolongation of patient survival or patient comfort. 

Toxicity, therapeutic and prophylactic efficacy of anti-microbials can be 
determined by standard pharmaceutical procedures in cell cultures and/or 
experimental organisms such as animals, e.g., for determining the LD50 (the dose 
lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 
50% of the population). The dose ratio between toxic and therapeutic effects is the 
therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds that 
exhibit large therapeutic indices are preferred. The data obtained from these cell 
culture assays and animal studies can be used in formulating a range of dosage for use 
in humans. The dosage of such compounds lies preferably within a range of 
circulating concentrations that include the ED50 with little or no toxicity. The dosage 
may vary within this range depending upon the dosage form employed and the route 
of administration utilized. 
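
As a small worked illustration of the therapeutic index defined above, the following 
sketch computes the ratio LD50/ED50 and orders candidate compounds by it. The function 
names and the compound values used in any call are hypothetical; the only relationship 
taken from the text is that the index is LD50 divided by ED50, with both doses in the 
same units, and that larger indices are preferred. 

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return LD50 / ED50; both doses must be positive and in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive doses")
    return ld50 / ed50


def rank_by_therapeutic_index(compounds: dict) -> list:
    """Sort compound names by therapeutic index, largest first.

    `compounds` maps a name to an (LD50, ED50) pair (illustrative structure).
    """
    return sorted(
        compounds,
        key=lambda name: therapeutic_index(*compounds[name]),
        reverse=True,
    )
```

For instance, a compound with an LD50 of 500 mg/kg and an ED50 of 25 mg/kg has a 
therapeutic index of 20. 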

For any compound identified and used in the method of the invention, the 
therapeutically effective dose can be estimated initially from cell culture assays. Such 
information can be used to more accurately determine useful doses in organisms such 
as plants and animals, preferably mammals, and most preferably humans. Levels in
plasma may be measured, for example, by HPLC or other means appropriate for
detection of the particular compound. 

The exact formulation, route of administration and dosage can be chosen by 
the individual physician in view of the patient's condition (see e.g. Fingl et al., in The
Pharmacological Basis of Therapeutics, 1975, Ch. 1, p. 1).

It should be noted that the attending physician would know how and when to
terminate, interrupt, or adjust administration due to toxicity, organ dysfunction, or
other systemic malady. Conversely, the attending physician would also know to
adjust treatment to higher levels if the clinical response were not adequate (precluding
toxicity). The magnitude of an administered dose in the management of the disorder
of interest will vary with the severity of the condition to be treated and the route of
administration. The severity of the condition may, for example, be evaluated, in part,
by standard prognostic evaluation methods. Further, the dose, and perhaps dose
frequency, will also vary according to the age, body weight, and response of the
individual patient. A program comparable to that discussed above also may be used
in veterinary or phytomedicine.

Depending on the specific infection target being treated and the method
selected, such agents may be formulated and administered systemically or locally, i.e.,
topically. Techniques for formulation and administration may be found in Alfonso
and Gennaro (1995). Suitable routes may include, for example, oral, rectal,
transdermal, vaginal, transmucosal, intestinal, parenteral, intramuscular,
subcutaneous, or intramedullary injections, as well as intrathecal, intravenous, or
intraperitoneal injections.

For injection, the agents of the invention may be formulated in aqueous
solutions, preferably in physiologically compatible buffers such as Hanks' solution,
Ringer's solution, or physiological saline buffer. For transmucosal administration,
penetrants appropriate to the barrier to be permeated are used in the formulation.
Such penetrants are generally known in the art.

Use of pharmaceutically acceptable carriers to formulate identified
anti-microbials of the present invention into dosages suitable for systemic administration is
within the scope of the invention. With proper choice of carrier and suitable
manufacturing practice, the compositions of the present invention, in particular those
formulated as solutions, may be administered parenterally, such as by intravenous
injection. Appropriate compounds can be formulated readily using pharmaceutically
acceptable carriers well known in the art into dosages suitable for oral administration.
Such carriers enable the compounds of the invention to be formulated as tablets, pills,
capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by
a patient to be treated.

Agents intended to be administered intracellularly may be administered using
techniques well known to those of ordinary skill in the art. For example, such agents
may be encapsulated into liposomes, then administered as described above.
Liposomes are spherical lipid bilayers with aqueous interiors. All molecules present
in an aqueous solution at the time of liposome formation are incorporated into the
aqueous interior. The liposomal contents are both protected from the external
microenvironment and, because liposomes fuse with cell membranes, are efficiently
delivered into the cell cytoplasm. Additionally, due to their hydrophobicity, small
organic molecules may be directly administered intracellularly.

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective amount to
achieve the intended purpose. Determination of the effective amounts is well within
the capability of those skilled in the art.

In addition to the active ingredients, these pharmaceutical compositions may
contain suitable pharmaceutically acceptable carriers comprising excipients and
auxiliaries which facilitate processing of the active compounds into preparations
which can be used pharmaceutically. The preparations formulated for oral
administration may be in the form of tablets, dragees, capsules, or solutions, including
those formulated for delayed release or only to be released when the pharmaceutical
reaches the small or large intestine.

The pharmaceutical compositions of the present invention may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating,
entrapping or lyophilizing processes.

Pharmaceutical formulations for parenteral administration include aqueous
solutions of the active anti-microbial compounds in water-soluble form.
Alternatively, suspensions of the active compounds may be prepared as appropriate
oily injection suspensions. Suitable lipophilic solvents or vehicles include fatty oils
such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or triglycerides,
or liposomes. Aqueous injection suspensions may contain substances which increase
the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or
dextran. Optionally, the suspension may also contain suitable stabilizers or agents
which increase the solubility of the compounds to allow for the preparation of highly
concentrated solutions.

Pharmaceutical preparations for oral use can be obtained by combining the
active compounds with solid excipient, optionally grinding a resulting mixture, and
processing the mixture of granules, after adding suitable auxiliaries, if desired, to
obtain tablets or dragee cores. Suitable excipients are, in particular, fillers such as
sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose preparations such
as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, gum
tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired,
disintegrating agents may be added, such as the cross-linked polyvinyl pyrrolidone,
agar, or alginic acid or a salt thereof such as sodium alginate.

Dragee cores are provided with suitable coatings. For this purpose,
concentrated sugar solutions may be used, which may optionally contain gum arabic,
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium
dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures.
Dyestuffs or pigments may be added to the tablets or dragee coatings for identification
or to characterize different combinations of active compound doses.

Pharmaceutical preparations which can be used orally include push-fit
capsules made of gelatin, as well as soft, sealed capsules made of gelatin and a
plasticizer, such as glycerol or sorbitol. The push-fit capsules can contain the active
ingredients in admixture with filler such as lactose, binders such as starches, and/or
lubricants such as talc or magnesium stearate and, optionally, stabilizers. In soft
capsules, the active compounds may be dissolved or suspended in suitable liquids,
such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition,
stabilizers may be added.

The above methodologies may be employed either actively or prophylactically
against an infection of interest.

Computer-related Aspects and Embodiments 

In addition to the provision of compounds as chemical entities, nucleotide
sequences, or fragments thereof at least 95%, preferably at least 97%, more preferably
at least 99%, and most preferably at least 99.9% identical to phage inhibitor sequences
can also be provided in a variety of additional media to facilitate various uses.

Thus, as used in this section, "provided" refers to an article of manufacture,
rather than an actual nucleic acid molecule, which contains a nucleotide sequence of
the present invention; e.g., a nucleotide sequence of an exemplary bacteriophage or a
sequence encoding a bacterial target or a fragment thereof, preferably a nucleotide
sequence at least 95%, more preferably at least 99% and most preferably at least
99.9% identical to such a bacteriophage or bacterial sequence, for example, to a
polynucleotide of an unsequenced phage listed in Table 1, preferably of bacteriophage
77 (S. aureus host) or bacteriophage 3A (S. aureus host) or bacteriophage 96 (S.
aureus host). Such an article provides a large portion of the particular bacteriophage
genome or bacterial gene and parts thereof (e.g., a bacteriophage open reading frame
(ORF)) in a form which allows a skilled artisan to examine and/or analyze the
sequence using means not directly applicable to examining the actual genome or gene
or subset thereof as it exists in nature or in purified form as a chemical entity.

In one application of this aspect, a nucleotide sequence of the present
invention can be recorded on computer readable media. As used herein, "computer
readable media" refers to any medium that can be read and accessed directly by a
computer. Such media include, but are not limited to: magnetic storage media, such
as floppy discs, hard disc storage medium, magnetic tape; optical storage media such
as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these
categories, such as magnetic/optical storage media. A skilled artisan can readily
appreciate how any of the presently known computer readable media can be used
to create an article of manufacture which includes one or more computer readable
media having recorded thereon a nucleotide sequence or sequences of the present
invention. Likewise, it will be clear to those of skill how additional computer
readable media that may be developed also can be used to create analogous
manufactures having recorded thereon a nucleotide sequence of the present invention.

As used herein, "recorded" refers to a process for storing information on
computer readable medium. A skilled artisan can readily adopt any of the presently
known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present
invention.

A variety of data storage structures are available to a skilled artisan for
creating a computer readable medium having recorded thereon a nucleotide sequence
of the present invention. The choice of the data storage structure will generally be
based on the means chosen to access the stored information. In addition, a variety of
data processor programs and formats can be used to store the nucleotide sequence
information of the present invention on computer readable medium. The sequence
information can, for example, be presented in a word processing text file, formatted in
commercially available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as
DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of
data processor structuring formats (e.g., text file or database) in order to obtain
computer readable medium having recorded thereon the nucleotide sequence
information of the present invention.
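
As an illustration of the two storage formats mentioned above (an ASCII text file and a database), the following minimal sketch records a sequence in a FASTA-style text file and in a relational table. It assumes Python's built-in sqlite3 module as a stand-in for database applications such as DB2, Sybase, or Oracle; the function names, table layout, and the short placeholder sequence are hypothetical and are not taken from this specification.

```python
import os
import sqlite3
import tempfile


def write_fasta(path, name, sequence, width=60):
    """Record a sequence as an ASCII, FASTA-style text file."""
    with open(path, "w", encoding="ascii") as fh:
        fh.write(f">{name}\n")
        for i in range(0, len(sequence), width):
            fh.write(sequence[i:i + width] + "\n")


def read_fasta(path):
    """Read back a single-record FASTA file as (name, sequence)."""
    with open(path, encoding="ascii") as fh:
        lines = [ln.strip() for ln in fh if ln.strip()]
    return lines[0][1:], "".join(lines[1:])


def store_in_database(conn, name, sequence):
    """Record a sequence in a relational table (hypothetical layout)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(name TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.execute("INSERT OR REPLACE INTO sequences VALUES (?, ?)", (name, sequence))
    conn.commit()


def fetch_from_database(conn, name):
    """Return the stored residues for a name, or None if absent."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    placeholder = "ATGGCTAAAGGTTGA" * 5  # placeholder, not a real phage sequence
    with tempfile.TemporaryDirectory() as tmp:
        fasta_path = os.path.join(tmp, "placeholder.fa")
        write_fasta(fasta_path, "placeholder", placeholder)
        print(read_fasta(fasta_path)[0])
    conn = sqlite3.connect(":memory:")
    store_in_database(conn, "placeholder", placeholder)
    print(len(fetch_from_database(conn, "placeholder")))
```

Either representation can serve as the recorded form of the sequence; the choice would depend, as noted above, on how the stored information is to be accessed.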

Computer software is publicly available which allows a skilled artisan to
access sequence information provided in a computer readable medium. Thus, by
providing in computer readable form a nucleotide sequence of an unsequenced
bacteriophage, such as an exemplary bacteriophage listed in Table 1 or of a sequence
encoding a bacterial target or a fragment thereof, preferably a nucleotide sequence at
least 95%, more preferably at least 99% and most preferably at least 99.9% identical
to such a bacteriophage or bacterial sequence, for example, to a polynucleotide of
bacteriophage 77 (S. aureus host), bacteriophage 3A (S. aureus host), bacteriophage
96 (S. aureus host), bacteriophage 44AHJD (S. aureus host), bacteriophage Dp-1
(Streptococcus pneumoniae host), or bacteriophage 182 (Enterococcus host), the
present invention enables the skilled artisan to routinely access the provided sequence
information for a wide variety of purposes.

Those skilled in the art understand that a variety of different search or analysis
software can implement sequence search and analysis
algorithms, e.g., the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and
BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms. For
example, such search algorithms can be implemented on a Sybase system and used to
identify open reading frames (ORFs) within the bacteriophage genome which contain
homology to ORFs or proteins from other viruses, e.g., other bacteriophage, and other
organisms, e.g., the host bacterium. Among the ORFs discussed herein are protein
encoding fragments of the bacteriophage genomes which encode bacteria-inhibiting
proteins or fragments.

The present invention further provides systems, particularly computer-based
systems, which contain the sequence information described. Such systems are
designed to identify, among other things, useful fragments of the bacteriophage
genomes.

As used herein, "a computer-based system" refers to the hardware, software,
and data storage media used to analyze the nucleotide sequence information of the
present invention. The minimum hardware of the computer-based systems of the
present invention comprises a central processing unit (CPU), input device, output
device, and data storage medium or media. A skilled artisan will readily recognize
that any of the currently available general purpose computer-based systems are suitable
for use in the present invention, as well as a variety of different specialized or
dedicated computer-based systems.

As stated above, the computer-based systems of the present invention
comprise data storage media having stored therein a nucleotide sequence of the
present invention and the necessary hardware and software for supporting and
implementing a search and/or analysis program.

As used herein, "data storage media" refers to memory which can store 
nucleotide sequence information of the present invention, or a memory access means 
which can access manufactures having recorded thereon the nucleotide sequence 
information of the present invention. 

As used herein, "search program" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or target
structural motif with the sequence information stored within the data storage means.
Search means are used to identify fragments or regions of the present genomic
sequences which match a particular target sequence or target motif. A variety of
known algorithms are disclosed publicly, and a variety of commercially available
software for conducting search means are available and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not
limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBIA). A skilled artisan
can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches and/or sequence analyses can be
adapted for use in the present computer-based systems.

As used herein in connection with sequence searches and analyses, a "target
sequence" can be any DNA or amino acid sequence of six or more nucleotides or two
or more amino acids. A skilled artisan can readily recognize that the longer a target
sequence is, the less likely a target sequence will be present as a random occurrence in
the database. Also, the target sequence length is preferably selected to include
sequence corresponding to a biologically relevant portion of an encoded product, for
example a region which is expected to be conserved across a range of source
organisms. Preferably the sequence length of a target polypeptide sequence is from
5-100 amino acids, more preferably 7-50 or 7-100 amino acids, and still more preferably
10-80 or 10-100 amino acids. Preferably the sequence length of a target
polynucleotide sequence is from 15-300 nucleotide residues, more preferably from
21-240 or 21-300, and still more preferably 30-150 or 30-300 nucleotide residues.
However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may
be of shorter length. Likewise, it may be desirable to search and/or analyze longer
sequences.
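
The relationship between target length and random occurrence noted above can be illustrated with a rough estimate. The sketch below assumes a single-stranded database of independent, equally frequent residues, which is a simplifying assumption and not a statement about any real database; the function name and the database size used in the example are hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of exact chance matches to one specific target.

    Assumes independent, uniformly distributed residues (4 for DNA,
    20 for protein) and counts every possible start position once.
    """
    positions = max(database_length - target_length + 1, 0)
    return positions * alphabet_size ** (-target_length)


# Illustration with a hypothetical 45,000-nucleotide database.
for length in (6, 15, 30):
    print(length, expected_random_hits(length, 45_000))
```

Under these assumptions, a six-nucleotide target would be expected to occur by chance roughly eleven times in a database of that size, whereas a 15-nucleotide target would be expected far less than once.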

As used herein, "a target structural motif," or "target motif," refers to any
rationally selected sequence or combination of sequences in which the sequence(s) are
chosen based on a three-dimensional configuration which is formed upon the folding
of the target motif. There are a variety of target motifs known in the art. Protein
target motifs include, but are not limited to, enzymatic active sites and signal
sequences. Nucleic acid target motifs include, but are not limited to, promoter
sequences, hairpin structures and inducible expression elements (protein binding
sequences).

A variety of structural formats for the input and output devices can be used to
input and output the information in the computer-based systems of the present
invention. A preferred format for an output device ranks fragments of the
bacteriophage or bacterial sequences possessing varying degrees of homology to the
target sequence or target motif. Such presentation provides a skilled artisan with a
ranking of sequences which contain various amounts of the target sequence or target
motif and identifies the degree of homology contained in the identified fragment.

A variety of comparing methods and/or devices and/or formats can be used to
compare a target sequence or target motif with the sequence stored in data storage
media to identify sequence fragments of the bacteriophage or bacterium in question.
One skilled in the art can readily recognize that any one of the publicly available
homology search programs can be used as the search program for the computer-based
systems of the present invention. Of course, suitable proprietary systems that may be
known to those of skill, or later developed, also may be employed in this regard.
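
The comparison and ranked output described above can be sketched with a deliberately simple, ungapped sliding-window comparison. This is only a stand-in for the homology search programs named in this section (it is not BLAST and does not model gaps or scoring matrices); the function names and the short test sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percentage of positions at which two equal-length strings agree."""
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def rank_fragments(stored, target, top=5):
    """Rank every window of the stored sequence by identity to the target.

    Returns up to `top` tuples (percent identity, start index, fragment),
    best first; ties are broken by the lower start index.
    """
    if not target:
        raise ValueError("target must not be empty")
    stored, target = stored.upper(), target.upper()
    n = len(target)
    hits = []
    for start in range(len(stored) - n + 1):
        window = stored[start:start + n]
        hits.append((percent_identity(window, target), start, window))
    hits.sort(key=lambda h: (-h[0], h[1]))
    return hits[:top]


# Hypothetical stored sequence and target, for illustration only.
for identity, start, fragment in rank_fragments("TTTACGTACGAAACGTTCG", "ACGTACG", top=3):
    print(f"{identity:5.1f}%  start={start}  {fragment}")
```

The output lists fragments in descending order of identity, which corresponds to the ranked presentation described above.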

Figure 6 provides a block diagram of a computer system illustrative of
embodiments of this aspect of the present invention. The computer system 102 includes a
processor 106 connected to a bus 104. Also connected to the bus 104 are a main
memory 108 (preferably implemented as random access memory, RAM) and a variety
of secondary storage devices 110, such as a hard drive 112 and a removable medium
storage device 114. The removable medium storage device 114 may represent, for
example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A
removable storage medium 116 (such as a floppy disk, a compact disk, a magnetic
tape, etc.) containing control logic and/or data recorded therein may be inserted into
the removable medium storage device 114. The computer system 102 includes
appropriate software for reading the control logic and/or the data from the removable
medium storage device 114, once it is inserted into the removable medium storage
device 114.

A nucleotide sequence of the present invention may be stored in a well-known
manner in the main memory 108, any of the secondary storage devices 110, and/or a
removable storage medium 116. During execution, software for accessing and
processing the sequence (such as search tools, comparing tools, etc.) resides in main
memory 108, in accordance with the requirements and operating parameters of the
operating system, the hardware system and the software program or programs.

The data storage medium in which the sequence is embodied and the central
processor need not be part of a single stand-alone computer, but may be separated so
long as data transfer can occur. For example, the processor or processors being
utilized for a search or analysis can be part of one general purpose computer, and the
data storage medium can be part of a second general purpose computer connected to a
network, or the data storage medium can be part of a network server. As another
example, the data storage medium can be part of a computer system or network
accessible over telephone lines or other remote connection method.

EXAMPLES

Example 1. Growth of Staph A bacteriophage 77 and purification of genomic DNA

The Staphylococcus aureus propagating strain (PS 77; ATCC #27699) was
used as a host to propagate its respective phage 77 (ATCC #27699-B1). Two rounds
of plaque purification of phage 77 were performed on soft agar essentially as
described in Sambrook et al. (1989). Briefly, the PS 77 strain was grown overnight at
37°C in Nutrient broth [NB: 0.3% Bacto beef extract, 0.5% Bacto peptone (Difco
Laboratories) and 0.5% NaCl (w/v)]. The culture was then diluted 20x in NB and
incubated at 37°C until the OD340 = 0.2 (early log phase) with constant agitation. In
order to obtain single plaques, phage 77 was subjected to 10-fold serial dilutions using
phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% Gelatin (w/v)) and
10 µl of each dilution was used to infect 0.5 ml of the cell suspension in the presence
of 400 µg/ml CaCl2. After incubation of 15 min at room temperature (RT), 2 ml of
melted soft agar kept at 45°C (NB supplemented with 0.6% agar) was added to the
mixture and poured onto the surface of 100 mm nutrient agar plates (0.3% Bacto Beef
extract, 0.5% Bacto peptone, 0.5% NaCl and 1.5% Bacto agar (w/v)). After overnight
incubation at 30°C, a single plaque was isolated, resuspended in 1 ml of phage buffer
by end over end rotation for 2 hrs at 20°C, and the phage suspension was diluted and
used for a second infection as described above. After overnight incubation at 30°C, a
single plaque was isolated and used as a stock.

The propagation procedure for bacteriophage 77 was modified from the agar
layer method of Swanström and Adams (1951). Briefly, the PS 77 strain was grown to
stationary phase overnight at 37°C in Nutrient broth. The culture was then diluted
twenty-fold in NB and incubated at 37°C until the OD340 = 0.2. The suspension (15 x 10^7
bacteria) was then mixed with 15 x 10^5 plaque forming units (pfu) to give a ratio of
100 bacteria/phage particle in the presence of 400 µg/ml of CaCl2. After incubation
for 15 min at 20°C, 7.5 ml of melted soft agar (NB plus 0.6% agar) were added to the
mixture and poured onto the surface of 150 mm nutrient agar plates and incubated 16
hrs at 30°C. To collect the phage plate lysate, 20 ml of NB were added to each plate
and the soft agar layer was collected by scraping off with a clean microscope slide
followed by shaking of the agar suspension for 5 min to break up the agar. The
mixture was then centrifuged for 10 min at 4,000 rpm (2,830 x g) in a JA-10 rotor
(Beckman) and the supernatant fluid (lysate) was collected and subjected to
treatment with 10 µg/ml of DNase I and RNase A for 30 min at 37°C. To precipitate
the phage particles, the phage suspension was adjusted to 10% (w/v) PEG 8000 and
0.5 M of NaCl followed by incubation at 4°C for 16 hrs. The phage was recovered by
centrifugation at 4,000 rpm (3,500 x g) for 20 min at 4°C on a GS-6R table top
centrifuge (Beckman). The pellet was resuspended with 2 ml of phage buffer (1 mM
MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% Gelatin). The phage suspension was
extracted with 1 volume of chloroform and further purified by centrifugation on a
cesium chloride step gradient as described in Sambrook et al. (1989), using a TLS 55
rotor centrifuged in an Optima TLX ultracentrifuge (Beckman) for 2 h at 28,000 rpm
(67,000 x g) at 4°C. Banded phage was collected and ultracentrifuged again on an
isopycnic cesium chloride gradient (1.45 g/ml) at 40,000 rpm (64,000 x g) for 24 h at
4°C using a TLV rotor (Beckman). The phage was harvested and dialyzed for 4 h at
room temperature against 4 L of dialysis buffer consisting of 10 mM NaCl, 50 mM
Tris-HCl [pH 8] and 10 mM MgCl2. Phage DNA was prepared from the phage
suspension by adding 20 mM EDTA, 50 mg/ml Proteinase K and 0.5% SDS and
incubating for 1 h at 65°C, followed by successive extractions with 1 volume of
phenol, 1 volume of phenol-chloroform and 1 volume of chloroform. The DNA was
then dialyzed overnight at 4°C against 4 L of TE (10 mM Tris pH 8.0, 1 mM EDTA).
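
The bacteria-to-phage ratio and the 10-fold serial dilutions used in this example are simple arithmetic, which the following minimal sketch makes explicit. The function names are hypothetical; the cell and pfu counts are the ones quoted in the text above.

```python
def bacteria_per_phage(bacteria, pfu):
    """Ratio of bacterial cells to plaque forming units in an infection mix."""
    if pfu <= 0:
        raise ValueError("pfu must be positive")
    return bacteria / pfu


def serial_dilution_factors(steps, fold=10):
    """Cumulative dilution factors for a series of equal-fold dilutions."""
    return [fold ** i for i in range(1, steps + 1)]


# Counts from the propagation step: 15 x 10^7 bacteria and 15 x 10^5 pfu.
ratio = bacteria_per_phage(15e7, 15e5)

# Cumulative factors for four successive 10-fold dilutions.
factors = serial_dilution_factors(4)
```

With the quoted counts the ratio evaluates to 100 bacteria per phage particle, in agreement with the text.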

Example 2. DNA sequencing of Bacteriophage 77 genome 

Four micrograms of phage 77 DNA was diluted in 200 µl of TE (10 mM Tris,
[pH 8.0], 1 mM EDTA) in a 1.5 ml eppendorf tube and sonication was performed
(550 Sonic Dismembrator™, Fisher Scientific). Samples were sonicated under an
amplitude of 3 µm with bursts of 5 s spaced by 15 s cooling in ice/water for 3 to 4
cycles. The sonicated DNA was then size fractionated by electrophoresis on 1%
agarose gels utilizing TAE (1 x TAE is: 40 mM Tris-acetate, 1 mM EDTA [pH 8.0])
as the running buffer. Fractions ranging from 1 to 2 kbp were excised from the
agarose gel and purified using a commercial DNA extraction system according to the
instructions of the manufacturer (Qiagen), with a final elution of 50 µl of 1 mM Tris
(pH 8.5).

The ends of the sonicated DNA fragments were repaired with a combination of
T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I, as
follows. Reactions were performed in a reaction mixture (final volume, 100 µl)
containing sonicated phage DNA, 10 mM Tris-HCl [pH 8.0], 50 mM NaCl, 10 mM
MgCl2, 1 mM DTT, 50 µg/ml BSA, 100 µM of each dNTP and 15 units of T4 DNA
polymerase (New England Biolabs) for 20 min at 12°C followed by addition of 12.5
units of Klenow large fragment (New England Biolabs) for 15 min at room
temperature. The reaction was stopped by two phenol/chloroform extractions and the
DNA was precipitated with ethanol and the final DNA pellet was resuspended in 20
µl of H2O.

Blunt-ended DNA fragments were cloned by ligation directly into the HincII
site of the pKSII+ vector (Stratagene) dephosphorylated by treatment with calf
intestinal alkaline phosphatase (New England Biolabs). A typical ligation reaction
contained 100 ng of vector DNA, 2 to 5 µl of
repaired sonicated phage DNA (50-100 ng) in a final volume of 20 µl containing 800
units of T4 DNA ligase (New England Biolabs) and was incubated overnight at 16°C.
Transformation and selection of bacterial clones containing recombinant plasmids was
performed in E. coli DH10β according to standard procedures (Sambrook et al.,
1989).

Recombinant clones were picked from agar plates into 96-well plates
containing 100 µl LB and 100 µg/ml ampicillin and incubated at 37°C. The presence
of phage DNA insert was confirmed by PCR amplification using T3 and T7 primers
flanking the HincII cloning site of the pKSII+ vector. PCR amplification of foreign
insert was performed in a 15 µl reaction volume containing 10 mM Tris (pH 8.3), 50
mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 µM primer, 187.5 µM each dNTP, and
0.75 units Taq polymerase (BRL). The thermocycling parameters were as follows:
initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec
denaturation at 94°C, 30 sec annealing at 57°C, and 2 min extension at 72°C,
followed by a single extension step at 72°C for 10 min. Clones with insert sizes of 1
to 2 kbp were selected and plasmid DNA was prepared from the selected clones using
QIAprep™ spin miniprep kit (Qiagen).

The nucleotide sequence of the extremities of each recombinant clone was
determined using an ABI 377-36 automated sequencer with two types of chemistry:
ABI prism Big Dye™ primer or ABI prism Big Dye™ terminator cycle sequencing
ready reaction kit (Applied Biosystems). To ensure co-linearity of the sequence data
and the genome, all regions of phage genome were sequenced at least once from both
directions on two separate clones. In areas where this criterion was not initially met, a
sequencing primer was selected and phage DNA was used directly as sequencing
template employing ABI prism Big Dye™ terminator cycle sequencing ready reaction
kit.

Example 3. Bioinformatic management of primary nucleotide sequence from
Phage 77.

Phage 77 sequence contigs were assembled using Sequencher™ 3.1 software
(GeneCodes). To close contig gaps, sequencing primers were selected near the edge of
the contigs. Phage DNA was used directly as sequencing template employing ABI
prism Big Dye™ terminator cycle sequencing ready reaction kit. The complete
sequence of bacteriophage 77 is shown in Table 2.

A software program was developed and used on the assembled sequence of
bacteriophage 77 to identify all putative ORFs larger than 33 codons. Other ORF
identification software can also be utilized, preferably programs which allow
alternative start codons. The software scans the primary nucleotide sequence starting
at nucleotide #1 for an appropriate start codon. Three possible selections can be made
for defining the nature of the start codon: I) selection of ATG, II) selection of ATG or
GTG, and III) selection of any of ATG, GTG, TTG, CTG, ATT, ATC, and ATA. This
latter initiation codon set corresponds to the one reported by the NCBI
(http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c) for the
bacterial genetic code.

When an appropriate start codon is encountered, a counting mechanism is
employed to count the number of codons (groups of three nucleotides) between this
start codon and the next stop codon downstream of it. If a threshold value of 33 is
reached, or exceeded, then the sequence encompassed by these two codons (start and
stop codons) is defined as an ORF. This procedure is repeated, each time starting at
the next nucleotide following the previous stop codon found, in order to identify all
the other putative ORFs. The scan is performed on all three reading frames of both
DNA strands of the phage sequence.
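
The scanning procedure described above can be sketched as follows. This is a minimal illustration written for this text, not the software program referred to in the example. It assumes that the scan proceeds codon by codon within each reading frame, that the codon count includes the start codon and excludes the stop codon, and that TAA, TAG and TGA are the stop codons; the function and variable names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# The three start-codon selections described in the text.
START_SETS = {
    "I": {"ATG"},
    "II": {"ATG", "GTG"},
    "III": {"ATG", "GTG", "TTG", "CTG", "ATT", "ATC", "ATA"},
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(sequence, start_set="I", min_codons=33):
    """Scan three frames of both strands for putative ORFs.

    Returns tuples (strand, frame, start, end, codons), where start and
    end are 0-based positions on the scanned strand and end is the
    position just past the stop codon.
    """
    starts = START_SETS[start_set]
    forward = sequence.upper()
    orfs = []
    for strand, seq in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            orf_start = None
            for pos in range(frame, len(seq) - 2, 3):
                codon = seq[pos:pos + 3]
                if orf_start is None:
                    if codon in starts:
                        orf_start = pos  # first start codon after the last stop
                elif codon in STOP_CODONS:
                    codons = (pos - orf_start) // 3  # start included, stop excluded
                    if codons >= min_codons:
                        orfs.append((strand, frame, orf_start, pos + 3, codons))
                    orf_start = None  # resume scanning after this stop codon
    return orfs
```

For example, `find_orfs("ATG" + "GCT" * 40 + "TAA")` reports a single 41-codon ORF on the forward strand in frame 0, while the same construct with only ten internal codons falls below the threshold of 33 and is not reported.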

Sequence homology (BLAST) searches for each ORF are then carried out
using an implementation of BLAST programs, although any of a variety of different
sequence comparison and matching programs can be utilized as known to those
skilled in the art. Downloaded public databases used for sequence analysis include:

i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z);
ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z);
iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z);
iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z);
v) S. aureus NCTC 8325 (ftp://ftp.genome.ou.edu/pub/staph/staph-1k.fa);
vi) Streptococcus pyogenes (ftp://ftp.genome.ou.edu/pub/strep/strep-1k.fa);
vii) Streptococcus pneumoniae
(ftp://ftp.tigr.org/pub/data/s_pneumoniae/gsp.contigs.112197.Z);
viii) Mycobacterium tuberculosis CSU#9 (ftp://ftp.tigr.org/pub/da...);
ix) Pseudomonas aeruginosa (http://www.genome.washington.edu/pseudo/data.html).

The results of the homology searches performed on the ORFs are shown in
Table 5.

Example 4. Subcloning of Bacteriophage 77 ORFs into a Staph A inducible
expression system.

The shuttle vector pT0021, in which the firefly luciferase (lucFF) expression
is controlled by the ars (arsenite) promoter/operator (Tauriainen et al., 1997), was
modified in the following fashion. Two oligonucleotides corresponding to a short
antigenic peptide derived from the hemagglutinin protein of influenza virus (HA
epitope tag) were synthesized (Field et al., 1988). The sense strand HA tag sequence
(with BamHI, SalI and HindIII cloning sites) is:

5'-gatcccggtcgaccaagcttTACCCATACGACGTCCCAGACTACGCCAGCTGA-3'
(where upper case letters denote the nucleotide sequence of the HA tag); the antisense
strand HA tag sequence (with a HindIII cloning site) is:

5'-agctTCAGCTGGCGTAGTCTGGGACGTCGTATGGGTAaagcttggtcgaccgg-3'
(where upper case letters denote the sequence of the HA tag). The two HA tag
oligonucleotides were annealed and ligated into the pT0021 vector, which had been
digested with BamHI and HindIII. This manipulation resulted in replacement of the
lucFF gene by the HA tag. This modified shuttle vector containing the arsenite
inducible promoter, the arsR gene, and the HA tag was named pTHA. A diagram
outlining our modification of pT0021 to generate pTHA is shown in Fig. 1A.
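
As a consistency check on the two HA tag oligonucleotides, the following Python sketch anneals them in silico and translates the upper-case tag region. It is offered only as an illustration: the helper names are hypothetical, a 4-nucleotide 5' overhang on each strand is assumed, and the standard genetic code is used for translation.

```python
SENSE = "gatcccggtcgaccaagcttTACCCATACGACGTCCCAGACTACGCCAGCTGA"
ANTISENSE = "agctTCAGCTGGCGTAGTCTGGGACGTCGTATGGGTAaagcttggtcgaccgg"

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Standard genetic code, bases ordered T, C, A, G at each codon position.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def reverse_complement(seq):
    """Return the reverse complement, preserving letter case."""
    return seq.translate(_COMPLEMENT)[::-1]


def annealed_duplex(sense, antisense, overhang=4):
    """Return (sense 5' overhang, double-stranded region, antisense 5' overhang).

    Raises ValueError if the strands are not complementary outside the
    assumed overhangs.
    """
    top = sense.upper()
    bottom_rc = reverse_complement(antisense).upper()
    duplex = top[overhang:]
    if duplex != bottom_rc[:-overhang]:
        raise ValueError("oligonucleotides are not complementary")
    return top[:overhang], duplex, antisense[:overhang].upper()


def translate(dna):
    """Translate a DNA string codon by codon ('*' marks a stop codon)."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# The HA tag is the upper-case portion of the sense oligonucleotide.
HA_TAG = "".join(base for base in SENSE if base.isupper())
```

With the sequences as printed, `annealed_duplex(SENSE, ANTISENSE)` yields a GATC overhang (compatible with a BamHI-cut end) and an AGCT overhang (compatible with a HindIII-cut end), and `translate(HA_TAG)` gives YPYDVPDYAS followed by a stop codon.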

Each ORF encoded by bacteriophage 77 that was larger than 33 amino acids and
had a Shine-Dalgarno sequence upstream of the initiation codon was selected for
functional analysis for bacterial inhibition. In total, 98 ORFs were selected and
screened as detailed below. A list of these is presented in Table 3. Each individual
ORF, from initiation codon to last codon (excluding the stop codon), was amplified
from phage genomic DNA using the polymerase chain reaction (PCR). For PCR
amplification of ORFs, each sense strand primer targets the initiation codon and is
preceded by a BamHI restriction site (5'-cgggatcc-3') and each antisense oligonucleotide
targets the penultimate codon (the one before the stop codon) of the ORF and is
preceded by a SalI restriction site (5'-gcgtcgaccg-3'). The PCR product of each ORF was
gel purified and digested with BamHI and SalI. The digested PCR product was then
gel purified using the Qiagen kit as described, ligated into BamHI and SalI digested
pTHA vector, and used to transform E. coli bacterial strain DH10β (as described
above). As a result of this manipulation, the HA tag is set in frame with the ORF and is
positioned at the carboxy terminus of each ORF (pTHA/ORF clones). Recombinant
pTHA/ORF clones were picked and their insert sizes were confirmed by PCR analysis
using primers flanking the cloning site. The names and sequences of the primers that
were used for the PCR amplification were: HAF:
5'-TATTATCCAAAACTTGAACA-3'; HAR: 5'-CGGTGGTATATCCAGTGATT-3'. The
sequence integrity of cloned ORFs was verified directly by DNA sequencing using
primers HAF and HAR. In cases where verification of ORF sequence could not be
achieved by one pass with the sequencing primers, additional internal primers were
selected and used for sequencing.
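
The primer layout described above (restriction-site tail followed by an ORF-specific region) can be illustrated with a short Python sketch. The function name and the 18-nucleotide annealing length are assumptions made for illustration; the text does not state how long the ORF-specific portion of each primer was.

```python
BAMHI_TAIL = "cgggatcc"    # 5' tail of each sense primer, as given in the text
SALI_TAIL = "gcgtcgaccg"   # 5' tail of each antisense primer, as given in the text
STOP_CODONS = {"TAA", "TAG", "TGA"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_primers(orf, anneal_len=18):
    """Return (sense, antisense) primers for an ORF given start codon to stop codon.

    anneal_len is a hypothetical parameter: the number of ORF-specific
    nucleotides placed 3' of each restriction-site tail.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0 or orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must be in frame and end with a stop codon")
    coding = orf[:-3]  # drop the stop codon so the HA tag can be fused in frame
    if len(coding) < anneal_len:
        raise ValueError("ORF is shorter than the annealing length")
    sense = BAMHI_TAIL + coding[:anneal_len]
    antisense = SALI_TAIL + reverse_complement(coding[-anneal_len:])
    return sense, antisense
```

The lower-case tails make it easy to see which part of each primer carries the cloning site and which part anneals to the phage template.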

Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) was used as a
recipient for the expression of recombinant plasmids. Electroporation was performed
essentially as previously described (Schenk and Laddaga, 1992). Selection of
recombinant clones was performed on Luria-Broth agar (LB-agar) plates containing
30 µg/ml of kanamycin.

For each ORF introduced in the pTHA plasmid, 3 independent transformants
were isolated and used to individually inoculate cultures in 5 ml of TSB containing
30 µg/ml kanamycin, followed by growth to saturation (16 hrs at 30°C). An aliquot of
this stationary phase culture was used to generate a frozen glycerol stock of the
transformant (stored at -80°C). The remaining culture was used for plasmid DNA
extraction. Bacterial cells were harvested by centrifugation at 3000 x g at 22°C for 5
min. The pellet was resuspended in 200 µl of 25% sucrose containing 25 U/ml of
lysostaphin and incubated for 15 min at 37°C. Then, 400 µl of alkaline SDS solution
(3% SDS, 0.2 N NaOH) were added, mixed well and incubated for 7 min at room
temperature. After the alkaline SDS treatment, 300 µl of ice-cold 3 M sodium acetate
pH 4.8 were added, and the mix was immediately spun at 13,000 x g for 15 min at room
temperature. The supernatant was transferred to a new 1.5 ml conical centrifuge tube
and 650 µl of isopropanol (stored at room temperature) were added. The mix was then
centrifuged at 13,000 x g for 5 min. The supernatant fluid was discarded, and the pellet
was washed with 70% ethanol and resuspended in 320 µl of sterile distilled water.

The presence of individual phage 77 ORF DNA inserts in the plasmid was
verified by PCR amplification using 1.5 µl of transformant miniprep DNA in a PCR
with primers flanking the cloning site of the ORF in the pTHA vector (HAF and HAR). The
composition of the PCR reaction and the cycling parameters are identical to those
employed for library screening described above.

Example 5. Functional assay for bacterial inhibitory activity of bacteriophage 77
ORFs.

The anti-microbial activity of individual phage 77 ORFs was monitored by
two growth inhibitory assays, one on solid agar medium, the other in liquid medium.
In general, Staphylococcus bacteria transformed with expression plasmids containing
individual ORFs were grown in normal TSA medium and stored in 19% glycerol. At
pre-determined times, arsenite was added to the culture to induce transcription of the
phage 77 ORFs cloned immediately downstream from an arsenite-inducible promoter
in the pTHA expression plasmid.

The effect of ORF induction on bacterial growth characteristics was then
monitored and quantitated. The growth inhibition assay on solid medium was
performed by streaking pTHA/ORF containing S. aureus transformants onto LB-Kn
and TSA-Kn plates containing increasing concentrations of sodium arsenite (0; 2.5; 5;
and 7.5 µM). Arsenite is used to induce the expression of cloned DNA in the pTHA
vector. In parallel, 3 µl of 1/10 and 1/100 dilutions of the frozen cultures of the
pTHA/ORF transformants were spotted as single drops onto LB-Kn and TSA-Kn
plates containing increasing concentrations of sodium arsenite (0; 2.5; 5; and 7.5 µM).
The plates were then incubated for 16 hrs at 37°C, and the effect of arsenite-induced ORF
expression on bacterial growth was monitored and quantitated by comparing the
extent of growth to that seen in control plates. As positive controls for growth inhibition, the
holin/lysin genes of the Staphylococcus aureus phage Twort (Loessner et al., 1998)
were subcloned into the pTHA ars inducible vector and used.

For the growth inhibition assay in liquid medium, stationary phase cultures
were prepared by inoculating 2.5 ml TSB-Kn with frozen S. aureus RN4220
transformants containing phage 77 ORFs cloned in the pTHA vector followed by
incubation for 16 hrs at 37°C. These cultures were then diluted 1/100 in the same
medium, and the bacteria were allowed to grow for 2 hrs at 37°C to reach early log
phase. 150 µl of such culture were then mixed with 2.35 ml TSB-Kn medium with or
without arsenite (the final concentration of arsenite in the medium was 0 or 5 µM).
After 3.5 hrs incubation at 37°C with shaking at 250 rpm, 100 µl of
bacterial culture was removed from each tube for OD565 measurement. Serial ten-fold
dilutions of the culture in buffered saline solution (0.85% NaCl) were then spotted
onto TSB-Kn plates. The plates were incubated at 37°C for 16 hrs and the number of
surviving colonies was counted the following day. The growth inhibitory property of
individual ORFs was then quantitated by comparing CFU numbers under normal or
arsenite-induction conditions. A schematic flow of the inhibition analysis is shown in
Fig. 3 (also applicable to inhibition analysis for the other phage and bacteria pointed
out herein). Inhibition results are shown in Figures 4A-C.
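
The CFU comparison described above reduces to simple arithmetic, sketched here in Python under stated assumptions. The function names are hypothetical, and the spotted volume is left as a parameter because the text does not give the volume spotted in the liquid assay.

```python
def cfu_per_ml(colonies, dilution_exponent, spotted_volume_ul):
    """Estimate CFU/ml of the undiluted culture.

    colonies          -- colonies counted in one spot
    dilution_exponent -- n for a 10^-n dilution (serial ten-fold dilutions)
    spotted_volume_ul -- volume spotted, in microlitres (assumed parameter)
    """
    if spotted_volume_ul <= 0:
        raise ValueError("spotted volume must be positive")
    return colonies * 10 ** dilution_exponent * 1000 / spotted_volume_ul


def surviving_fraction(cfu_induced, cfu_uninduced):
    """Ratio of CFU with arsenite induction to CFU without induction."""
    if cfu_uninduced <= 0:
        raise ValueError("uninduced control must show growth")
    return cfu_induced / cfu_uninduced
```

For example, 30 colonies in a 3 µl spot of a 10^-4 dilution correspond to 1 x 10^8 CFU/ml; a surviving fraction well below 1 for an induced culture would indicate growth inhibition by the cloned ORF.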


Example 6: Identification of Cecropin Signature Motif in Staphylococcus aureus
Bacteriophage 3A ORF

The genome for S. aureus bacteriophage 3A was determined and the sequence
was analyzed essentially as described for bacteriophage 77 in the examples above.
Upon BLAST analysis of the identified open reading frames of phage 3A, the presence of
an amino acid sequence corresponding to a cecropin signature motif was observed.
This motif (WDGHKTLEK) is located at position aa 481-489. Cecropins were
originally identified in proteins from the cecropia moth and are recognized as potent
antibacterial proteins that constitute an important part of the cell-free immunity of
insects. Cecropins are small proteins (31-39 amino acid residues) that are active
against both Gram-positive and Gram-negative bacteria by disrupting the bacterial
membranes. Although the mechanisms by which the cecropins cause cell death are
not fully understood, they are generally thought to involve channel formation and
membrane destabilization.
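
Locating a short motif such as WDGHKTLEK in a predicted protein and reporting its residue positions can be done with a plain string search. The sketch below is illustrative only; it looks for exact matches and is not the BLAST-based analysis used in this example. The function name is hypothetical.

```python
def find_motif(protein, motif):
    """Return 1-based (first, last) residue positions of every exact motif match."""
    protein, motif = protein.upper(), motif.upper()
    hits = []
    pos = protein.find(motif)
    while pos != -1:
        hits.append((pos + 1, pos + len(motif)))
        pos = protein.find(motif, pos + 1)  # allow overlapping matches
    return hits
```

A nine-residue motif beginning at residue 481 is reported as `(481, 489)`, matching the numbering convention used in the text.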

The identification of a motif corresponding to a known inhibitor suggests that
the product of ORF002 is also an inhibitory compound. Such inhibitory activity can
be confirmed as described herein or by other methods known in the art. Confirmation
of the inhibitory activity would indicate that the ORF product could serve as the basis
for construction of mimetic compounds and other inhibitors directed to the target of
the ORF002 product.

Boman & Hultmark, 1987, Ann. Rev. Microbiol. 41:103-126.

Boman, 1991, Cell 65:205-207.

Boman et al., 1991, Eur. J. Biochem. 201:23-31.

Wang et al., J. Biol. Chem. 273:27438-27448.

Example 7. Growth of Staphylococcus aureus bacteriophage 44AHJD:

Staphylococcus aureus propagating strain (PS 44A) (Felix d'Herelle Reference
Centre #HER 1101) was used as a host to propagate its respective phage 44AHJD
(Felix d'Herelle Reference Centre #HER 101). Two rounds of plaque purification of
phage 44AHJD were performed on soft agar essentially as described in Sambrook et
al. (1989). Briefly, the Staphylococcus aureus PS strain was grown overnight at 37°C
in Nutrient Broth [NB: 3 g Bacto Beef Extract, 5 g Bactopeptone per liter (Difco
Laboratories # 0003-17-8), supplemented with 0.5% NaCl]. The culture was then
diluted 20 fold in NB and incubated at 37°C until an OD540 of 0.2. In order to obtain
single plaques, phage 44AHJD was subjected to 10-fold serial dilutions using the
phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% Gelatin) and 10 µl
were used to infect 0.5 ml of the cell suspension in the presence of 400 µg/ml of
CaCl2. After incubation for 15 min at room temperature, 2 ml of melted soft agar (NB
supplemented with 0.6% of agar) were added to the mixture and poured onto the
surface of 100 mm nutrient agar plates [3 g Bacto Beef extract, 5 g Bactopeptone,
0.5% NaCl and 15 g of Bacto agar per liter (Difco Laboratories # 0001-17-0)]. After
overnight incubation at 37°C, a single plaque was isolated and resuspended in 1 ml of
phage buffer by end over end rotation for 2 h at room temperature, and the phage
suspension was diluted and used for a second infection as described above. After
overnight incubation at 37°C, a single plaque was isolated and used as a stock.

Large scale purification of bacteriophage and preparation of phage DNA was
as follows.

The propagation method was carried out by using the agar layer method
described by Swanström and Adams (1951). Briefly, the PS 44A strain was grown to
stationary phase overnight at 37°C in Nutrient Broth. The culture was then diluted 20x
in NB and incubated at 37°C until the A540 reached 0.2. The suspension (15 x 10^7 bacteria)
was then mixed with 15 x 10^5 phage particles to give a ratio of 100 bacteria/phage
particle in the presence of 400 µg/ml of CaCl2. After incubation for 15 min at room
temperature, 7.5 ml of melted soft agar were added to the mixture and poured onto the
surface of 150 mm nutrient agar plates and incubated overnight at 37°C. To collect the
lysate, 20 ml of NB were added to each plate and the soft agar layer was collected by
scraping off with a clean microscope slide and shaken vigorously for 5 min to break
up the agar. The mixture was then centrifuged for 10 min at 4,000 rpm (2,830 x g)
using a JA-10 rotor (Beckman) and the supernatant (lysate) was collected and subjected
to a treatment with 10 µg/ml of DNase I and RNase A for 30 min at 37°C. To
precipitate the phage particles, 10% (w/v) of PEG 8000 and 0.5 M of NaCl were
added to the lysate and the mixture was incubated on ice for 16 h. The phage was
recovered by centrifugation at 4,000 rpm (3,500 x g) for 20 min at 4°C on a GS-6R
table top centrifuge (Beckman).
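
Two of the quantities in this procedure lend themselves to a quick arithmetic check: the bacteria-to-phage ratio and the relationship between rotor speed and relative centrifugal force. The Python sketch below uses the general relation RCF = 1.118 x 10^-5 x r(cm) x rpm^2; the function names are hypothetical, and the radius is back-calculated from the stated speed and force rather than taken from a rotor specification.

```python
RCF_CONSTANT = 1.118e-5  # converts radius in cm and speed in rpm to multiples of g


def bacteria_per_phage(bacteria, phage_particles):
    """Ratio of bacterial cells to phage particles in the infection mix."""
    return bacteria / phage_particles


def rcf(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def implied_radius_cm(rpm, rcf_value):
    """Radius (cm) that reproduces a stated rpm / x g pair."""
    return rcf_value / (RCF_CONSTANT * rpm ** 2)
```

With the numbers given above, 15 x 10^7 bacteria and 15 x 10^5 phage particles give a ratio of 100, and 4,000 rpm at 2,830 x g implies an effective radius of roughly 15.8 cm.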

The pellet was resuspended with 2 ml of phage buffer (1 mM MgSO4, 5 mM
MgCl2, 80 mM NaCl and 0.1% Gelatin). The phage suspension was extracted with 1
volume of chloroform and further purified by centrifugation on a preformed cesium
chloride step gradient as described in Sambrook et al. (1989), using a TLS 55 rotor
and centrifuged in an Optima TLX ultracentrifuge (Beckman) for 2 h at 28,000 rpm
(67,000 x g) at 4°C. Banded phage was collected and ultracentrifuged again on an
isopycnic cesium chloride gradient (1.45 g/ml) at 40,000 rpm (64,000 x g) for 24 h at
4°C using a TLV rotor (Beckman). The phage was harvested and dialyzed for 4 h at
room temperature against 4 L of dialysis buffer consisting of 10 mM NaCl, 50 mM
Tris-HCl [pH 8] and 10 mM MgCl2. Phage DNA was prepared from the phage
suspension by adding 20 mM EDTA, 50 µg/ml Proteinase K and 0.5% SDS and
incubating for 1 h at 65°C, followed by successive extractions with 1 volume of
phenol, 1 volume of phenol-chloroform and 1 volume of chloroform. The DNA was
then dialyzed overnight at 4°C against 4 L of TE (10 mM Tris-HCl [pH 8.0], 1 mM
EDTA).


Example 8. DNA sequencing of the Bacteriophage 44 AHJD genome. 

Four µg of phage DNA was diluted in 200 µl of TE pH 8.0 in a 1.5 ml
eppendorf tube and sonication was performed (550 Sonic Dismembrator, Fisher
Scientific). Samples were sonicated under an amplitude of 3 µm with bursts of 5 s
spaced by 15 s cooling in ice/water for 3 to 4 cycles. The sonicated DNA was then
size fractionated by electrophoresis on 1% agarose gels.
Fractions ranging from 1 to 2 kbp were excised from the agarose gel and purified
using a commercial DNA extraction system according to the instructions of the
manufacturer (Qiagen) and eluted in 50 µl of 1 mM Tris-HCl [pH 8.5].

The ends of the sonicated DNA fragments were repaired with a combination of
T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I as
follows. Reactions were performed in a final volume of 100 µl containing DNA, 10
mM Tris-HCl pH 8.0, 50 mM NaCl, 10 mM MgCl2, 1 mM DTT, 5 µg BSA, 100 µM
of each dNTP and 15 units of T4 DNA polymerase (New England Biolabs) for 20 min
at 12°C followed by addition of 12.5 units of Klenow fragment (New England
Biolabs) for 15 min at room temperature. The reaction was stopped by two
phenol/chloroform extractions and the DNA was ethanol precipitated and resuspended
in 20 µl of H2O.

Cloning of the sonicated phage DNA into pKSII vector and transformation:

Blunt-ended DNA fragments were cloned by ligation directly into the HincII
site of the pKSII vector (Stratagene) dephosphorylated with calf intestinal alkaline
phosphatase (New England Biolabs). A typical reaction contained 100 ng of vector and 2
to 5 µl of repaired sonicated phage DNA (50-100 ng) in a final volume of 20 µl
containing 800 units of T4 DNA ligase (New England Biolabs), and was incubated overnight at 16°C.
Transformation and selection of positive clones was performed in the host strain
DH10β of E. coli using ampicillin as a selective antibiotic as described in Sambrook
et al. (1989).

Recombinant clones were picked from agar plates into 96-well plates
containing 100 µl LB and 100 µg/ml ampicillin and incubated at 37°C. The presence
of phage DNA insert was confirmed by PCR amplification using T3 and T7 primers
flanking the HincII cloning site of the pKS vector. PCR amplification of the potential
foreign inserts was performed in a 15 µl reaction volume containing 10 mM Tris-HCl
(pH 8.3), 50 mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 µM primer, 187.5 µM each
dNTP, and 0.75 units Taq polymerase (BRL). The thermocycling parameters were as
follows: initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec
denaturation at 94°C, 30 sec annealing at 58°C, and 2 min extension at 72°C, followed
by a single extension step at 72°C for 10 min. Clones with insert sizes of 1 to 2 kbp
were selected and plasmid DNA was prepared from the selected clones using the
QIAprep™ spin miniprep kit (Qiagen).
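
The thermocycling parameters given above can be written down as a small data structure, which also makes the total programmed hold time easy to compute. This is an illustrative Python sketch; the dictionary layout and function name are hypothetical, and ramp times between temperatures are ignored.

```python
# Each step is (temperature in degrees C, hold time in seconds), as stated in the text.
INSERT_SCREEN_PROGRAM = {
    "initial_denaturation": (94, 120),
    "cycles": 20,
    "cycle_steps": [(94, 30), (58, 30), (72, 120)],  # denature, anneal, extend
    "final_extension": (72, 600),
}


def hold_time_seconds(program):
    """Sum of all programmed hold times, excluding temperature ramps."""
    per_cycle = sum(seconds for _, seconds in program["cycle_steps"])
    return (
        program["initial_denaturation"][1]
        + program["cycles"] * per_cycle
        + program["final_extension"][1]
    )
```

For the program above, the hold times add up to 4,320 seconds, that is, 72 minutes before ramping is taken into account.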

The nucleotide sequence of the extremities of each recombinant clone was determined
using an ABI 377-36 automated sequencer with two types of chemistry: ABI prism
BigDye™ primer cycle sequencing (-21 M13 primer: #403055) (M13REV primer:
#403056) or ABI prism BigDye™ terminator cycle sequencing ready reaction kit
(Applied Biosystems; #4303152). To ensure co-linearity of the sequence data and the
genome, all regions of the phage genome were sequenced at least once from both
directions on two separate clones. In areas where this criterion was not initially met, a
sequencing primer was selected and phage DNA was used directly as sequencing
template employing ABI prism BigDye™ terminator cycle sequencing ready reaction
kit.

Example 9. Bioinformatic management of primary nucleotide sequence.

Sequence contigs were assembled using Sequencher™ 3.1 software
(GeneCodes). To close contig gaps, sequencing primers were selected near the edge of
the contigs. Phage DNA was used directly as sequencing template employing ABI
prism BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems;
#4303152). The complete sequence of Staphylococcus aureus bacteriophage 44AHJD
is shown in Table 16.

A software program was used on the assembled sequence of bacteriophage
44AHJD to identify all putative ORFs larger than 33 codons. The software scans the
primary nucleotide sequence starting at nucleotide #1 for an appropriate start codon.
Three possible selections can be made for defining the nature of the start codon: I)
selection of ATG, II) selection of ATG or GTG, and III) selection of any of ATG,
GTG, TTG, CTG, ATT, ATC, and ATA. This latter initiation codon set corresponds
to the one reported by the NCBI
(http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c) for the
bacterial genetic code. When an
appropriate start codon is encountered, a counting mechanism is employed to count
the number of codons (groups of three nucleotides) between this start codon and the
next stop codon downstream of it. If a threshold value of 33 is reached, or exceeded,
then the sequence encompassed by these two codons is defined as an ORF. This
procedure is repeated, each time starting at the next nucleotide following the previous
stop codon found, in order to identify all the other putative ORFs. The scan is
performed on all three reading frames of both DNA strands of the phage sequence.
The predicted ORFs for bacteriophage 44AHJD are listed in Tables 17 & 18.

Sequence homology searches for each ORF were carried out using an
implementation of BLAST programs. Downloaded public databases used for sequence
analysis include:

i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z);

ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z);

iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z);

iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z);

v) Staphylococcus aureus NCTC 8325 (ftp://ftp.genome.ou.edu/pub/staph/staph-1k.fa);

vi) Sta/?Ay<TC0CCtt.s^ 1121 
30 97.Z); 

vii) PRODOM (ftp://ftp.toulouse.inra.fr/pub/prodom/current_release/pro
astgz);

viii) DOMO (ftp://ftp.infobiogen.fr/pub/db/domo/);

ix) TREMBL (ftp://www.expasy.ch/databases/sp_tr_nrdb/fasta/).

The results of the homology searches performed on the ORFs of bacteriophage
44AHJD are shown in Tables 19 & 20.



Example 10. Sub-cloning of Bacteriophage 44 AHJD ORFs.

Expression preferably utilizes a shuttle expression vector which is arranged
such that expression of the exogenous bacteriophage 44 AHJD ORF sequence is
inducible. For example, the shuttle vector pT0021, in which the firefly luciferase
(lucFF) expression is controlled by the ars (arsenite) promoter/operator (Tauriainen et
al., 1997), can be modified in the following fashion. Two oligonucleotides
corresponding to a short antigenic peptide derived from the hemagglutinin protein of
influenza virus (HA epitope tag) were synthesized (Field et al., 1988). The sense
strand HA tag sequence (with BamHI, SalI and HindIII cloning sites) is:

5'-gatcccggtcgaccaagcttTACCCATACGACGTCCCAGACTACGCCAGCTGA-3'
(where upper case letters denote the nucleotide sequence of the HA tag); the antisense
strand HA tag sequence (with a HindIII cloning site) is:

5'-agctTCAGCTGGCGTAGTCTGGGACGTCGTATGGGTAaagcttggtcgaccgg-3'
(where upper case letters denote the sequence of the HA tag). The two HA tag
oligonucleotides were annealed and ligated into the pT0021 vector, which had been
digested with BamHI and HindIII. This manipulation resulted in replacement of the
lucFF gene by the HA tag. This modified shuttle vector containing the arsenite
inducible promoter, the arsR gene, and the HA tag was named pTHA. A diagram
outlining our modification of pT0021 to generate pTHA is shown in Fig. 1A (another
useful vector construct is shown in Fig. 1B).

Each ORF encoded by bacteriophage 44 AHJD that is larger than 33 amino acids
and has a Shine-Dalgarno sequence upstream of the initiation codon can be
selected for functional analysis for bacterial inhibition. Each individual ORF, from
initiation codon to last codon (excluding the stop codon), can be amplified from phage
genomic DNA using the polymerase chain reaction (PCR). For PCR amplification of
ORFs, each sense strand primer targets the initiation codon and is preceded by a
BamHI restriction site (5'-cgggatcc-3') and each antisense oligonucleotide targets the
penultimate codon (the one before the stop codon) of the ORF and is preceded by a
SalI restriction site (5'-gcgtcgaccg-3'). The PCR product of each ORF can be gel
purified and digested with BamHI and SalI. The digested PCR product can then be
gel purified using the Qiagen kit as described, ligated into BamHI and SalI digested
pTHA vector, and used to transform E. coli bacterial strain DH10β (as described
above). As a result of this manipulation, the HA tag is set in frame with the ORF and is
positioned at the carboxy terminus of each ORF (pTHA/ORF clones). Recombinant
pTHA/ORF clones will be picked and their insert sizes confirmed by PCR
analysis using primers flanking the cloning site. The following primers can be used
for PCR amplification: HAF: 5'-TATTATCCAAAACTTGAACA-3'; HAR:
5'-CGGTGGTATATCCAGTGATT-3'. The sequence integrity of cloned ORFs can be
verified directly by DNA sequencing using primers HAF and HAR. In cases where
verification of ORF sequence cannot be achieved by one pass with the sequencing
primers, additional internal primers will be selected and used for sequencing.

Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) will be used as
a recipient for the expression of recombinant plasmids. Electroporation will be
performed essentially as previously described (Schenk and Laddaga, 1992). Selection
of recombinant clones will be performed on Luria-Broth agar (LB-agar) plates
containing 30 µg/ml of kanamycin.

Alternatively, a constitutive promoter can be used to drive expression of the
introduced ORF, and cell growth can be compared to that of control bacterial cells
containing the parental vector lacking any introduced phage ORF. Recombinant plasmids will be
introduced into Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) using
electroporation as previously described (Schenk and Laddaga, 1992).
Cloning of ORFs with a Shine-Dalgarno sequence 

ORFs with a Shine-Dalgarno sequence are selected for functional analysis of
bacterial killing. Each ORF, from initiation codon to last codon (excluding the stop
codon), can be amplified by PCR from phage genomic DNA. For PCR amplification
of ORFs, each sense strand primer starts at the initiation codon and is preceded by a
restriction site and each antisense strand primer starts at the last codon (excluding the stop
codon) and is preceded by a different restriction site. The PCR product of each ORF
will be gel purified and digested with the restriction enzymes whose sites are contained in
the PCR oligonucleotides. The digested PCR product is then gel purified using the
Qiagen kit, ligated into the modified shuttle vector, and used to transform bacterial
strain DH10. Recombinant clones are then picked and their insert sizes confirmed by
PCR analysis using primers flanking the cloning site as well as restriction digestion.
The sequence fidelity of cloned ORFs can be verified by DNA sequencing using the
same primers as used for PCR. In cases where the verification of ORFs cannot be
achieved by one pass of sequencing using primers flanking the cloning site, internal
primers can be selected and used for sequencing. Recombinant plasmids can be
introduced into Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) using
electroporation as previously described (Schenk and Laddaga, 1992).
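
The text does not state how the presence of a Shine-Dalgarno sequence was scored. As one possible illustration, the Python sketch below flags an ORF when a purine-rich core derived from the AGGAGG consensus ends a short distance upstream of the initiation codon. The motif list, the spacer range of 4 to 12 nucleotides, and the function name are all assumptions chosen for the example, not criteria taken from this specification.

```python
# Assumed cores: the AGGAGG consensus and its sub-motifs of at least 4 nt.
SD_MOTIFS = ("AGGAGG", "GGAGG", "AGGAG", "GAGG", "GGAG", "AGGA")


def has_shine_dalgarno(genome, start, min_spacer=4, max_spacer=12):
    """Return True if an assumed SD core ends min_spacer..max_spacer nt upstream.

    genome -- DNA string of the strand carrying the ORF
    start  -- 0-based index of the first base of the initiation codon
    """
    genome = genome.upper()
    for spacer in range(min_spacer, max_spacer + 1):
        end = start - spacer  # position just past the candidate motif
        if end <= 0:
            break
        # Look at the six bases ending at this position (the longest motif length).
        if genome[max(0, end - 6):end].endswith(SD_MOTIFS):
            return True
    return False
```

Tightening the motif list or the spacer window makes the filter more stringent; either choice changes how many ORFs would pass on to cloning.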
Induction of gene expression from the ars promoter. 

If an inducible promoter is used, e.g., the ars promoter, induction can be
assessed, for example, by either of the following two methods.

1. Screening on agar plates

The functional identification of killer ORFs can be performed by spreading an aliquot
of S. aureus transformed cells containing phage 44 AHJD ORFs onto agar plates
containing different concentrations of sodium arsenite (0; 2.5; 5; and 7.5 µM). The
plates are incubated overnight at 37°C, after which the growth of the ORF
transformants on plates that contain arsenite is compared to that on plates without arsenite.
2. Quantification of growth inhibition in liquid medium

Cells containing different recombinant plasmids can be grown overnight at
37°C in LB medium supplemented with the appropriate antibiotic selection. These are
then diluted to the mid log phase (OD540 ~ 2) with fresh media containing antibiotic
and transferred to 96-well microtitration plates (100 µl/well). Inducer is then added at
different final concentrations (ranging from 2.5 to 10 µM) and the culture incubated
for an additional 2 hrs at 37°C. The effect of expression of the phage 44 AHJD ORFs
on bacterial cell growth is then monitored by measuring the OD540 and comparing the
rate of growth to the culture not containing inducer. As positive controls for growth
inhibition, the hlA gene of phage lambda (Reisinger, GR., Rietsch, A., Lubitz, W., and
Bläsi, U. 1993. Virology 193:1033-1036), and the holin/lysin genes of the
Staphylococcus aureus phage Twort (Loessner, MJ., Gaeng, S., Wendlinger, G.,
Maier, SK. and Scherer, S. 1998. FEMS Microbiology Letters 162:265-274) can be
subcloned into the ars inducible vector. An aliquot of the induced and uninduced
culture can also be plated out on agar plates containing an appropriate antibiotic
selection but lacking inducer. Following incubation overnight at 37°C, the number of
colonies is counted. Any ORF showing bacteriostatic activity will show a lower, but
detectable, number of colonies on the agar plates when grown in the presence of
inducer as compared to when grown in the absence of inducer. Any ORF showing
full bactericidal activity will show no colonies on the agar plates when grown in the
presence of inducer as compared to when grown in the absence of inducer.
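
The plating criteria described above can be expressed as a simple decision rule. The Python sketch below is a literal reading of those criteria and nothing more: the function name is hypothetical, and a practical screen would need replicate plates and a threshold for what counts as a meaningfully lower colony number, neither of which is specified here.

```python
def classify_orf(colonies_induced, colonies_uninduced):
    """Classify an ORF from colony counts after growth with and without inducer."""
    if colonies_uninduced <= 0:
        raise ValueError("uninduced control shows no growth; result is uninterpretable")
    if colonies_induced == 0:
        return "bactericidal"    # no colonies recovered after induction
    if colonies_induced < colonies_uninduced:
        return "bacteriostatic"  # fewer, but detectable, colonies after induction
    return "no inhibition"
```

For instance, counts of 0 versus 200 colonies would be scored as bactericidal, 40 versus 200 as bacteriostatic, and 210 versus 200 as no inhibition.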
Example 11. Growth of Enterococcus bacteriophage 182 and purification of
genomic DNA.

The Enterococcus propagating strain (PS) (Enterococcus sp. Group D, Felix
d'Herelle Reference Centre #HER 1080) was used as host to propagate its respective
phage 182 (Felix d'Herelle Reference Centre #HER 80). Two rounds of plaque
purification of phage 182 were performed on soft agar essentially as described in
Sambrook et al. (1989). Briefly, the Enterococcus sp. PS strain was grown overnight
at 37°C in Tryptic Soy Broth [TSB: 17 g Bacto tryptone, 3 g Bacto soytone, 2.5 g
Bacto dextrose, 5 g Sodium chloride, and 2.5 g Dipotassium phosphate per liter
(Difco Laboratories #0370-17-3)]. The culture was then diluted 20 fold in TSB and
incubated at 37°C with constant agitation until the OD540 = 0.2 (early log phase). In
order to obtain single plaques, phage 182 was subjected to 10 fold serial dilutions
using the phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% Gelatin
(w/v)) and 10 µl of each dilution was used to infect 0.5 ml of the bacterial cell
suspension. After incubation for 15 min at 37°C, 2 ml of melted soft agar (TSB
supplemented with 0.6% agar) was added to the mixture and poured onto the surface
of 100 mm Tryptic Soy Agar plates [TSA: 15 g Tryptone peptone, 5 g Soytone
peptone, 5 g Sodium chloride and 15 g of Agar per liter (Difco Laboratories #0369-17)].
After overnight incubation at 37°C, a single plaque was isolated and resuspended in
1 ml of phage buffer by end over end rotation for 2 hrs at room temperature, and the
phage suspension was diluted and used for a second infection as described above.
After overnight incubation at 37°C, a single plaque was isolated and used as a stock
for all subsequent manipulations.

The propagation procedure for bacteriophage 182 was modified from the agar
layer method of Swanstrom and Adams (1951). Briefly, the Enterococcus sp. PS
strain was grown to stationary phase overnight at 37°C in TSB. The culture was then
diluted 20-fold in TSB and incubated at 37°C until the A540 = 0.2. The suspension
(15 x 10^7 bacteria) was then mixed with 15 x 10^5 plaque forming units (pfu) to give a
ratio of 100 bacteria/pfu. After incubation for 15 min at 37°C, 7.5 ml of melted soft
agar (TSB plus 0.6% agar) was added to the mixture, which was poured onto the surface of
150 mm TSA plates and incubated for 16 hrs at 37°C. To collect the plate lysate, 20 ml
of TSB was added to each plate and the soft agar layer was scraped off
with a clean microscope slide, followed by vigorous shaking of the agar suspension for
5 min to break up the agar. The mixture was then centrifuged for 10 min at 4,000 rpm
(2,830 x g) using a JA-10 rotor (Beckman), and the supernatant fluid (lysate) was
collected and treated with 10 µg/ml of DNase I and RNase A for 30
min at 37°C. To precipitate the phage particles, the phage suspension was adjusted to
10% (w/v) PEG 8000 and 0.5 M NaCl, followed by incubation at 4°C for 16 hrs.
The phage was recovered by centrifugation at 4,000 rpm (3,500 x g) for 20 min at 4°C
in a GS-6R table top centrifuge (Beckman). The pellet was resuspended in 2 ml of
phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% gelatin). The
phage suspension was extracted with 1 volume of chloroform and further purified by
centrifugation on a cesium chloride step gradient as described in Sambrook et al.
(1989), using a TLS-55 rotor in an Optima TLX ultracentrifuge
(Beckman) for 2 hrs at 28,000 rpm (67,000 x g) at 4°C. Banded phage was collected
and ultracentrifuged again on an isopycnic cesium chloride gradient (1.45 g/ml) at
40,000 rpm (64,000 x g) for 24 hrs at 4°C using a TLV rotor (Beckman). The phage
was harvested and dialyzed for 4 hrs at room temperature against 4 L of dialysis
buffer consisting of 10 mM NaCl, 50 mM Tris-HCl [pH 8] and 10 mM MgCl2. Phage
DNA was prepared from the phage suspension by adding 20 mM EDTA, 50 µg/ml
proteinase K and 0.5% SDS and incubating for 1 hr at 65°C, followed by successive
extractions with 1 volume of phenol, 1 volume of phenol-chloroform and 1 volume of
chloroform. The DNA was then dialyzed overnight at 4°C against 4 L of TE (10 mM
Tris-HCl [pH 8.0], 1 mM EDTA).
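
The rpm and x g pairs quoted in the purification procedure can be cross-checked with the standard relation RCF = 1.118 x 10^-5 x r x N^2 (r in cm, N in rpm). The following is only a sketch: the function names are illustrative, and the radii it returns are values implied by the quoted numbers, not figures stated in this procedure.

```python
# Standard conversion factor for radius in cm and rotor speed in rpm.
RCF_CONSTANT = 1.118e-5


def rcf(radius_cm, rpm):
    """Relative centrifugal force (x g) at a given radius and speed."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def implied_radius_cm(rcf_xg, rpm):
    """Radius (cm) at which the given speed would produce the given x g."""
    return rcf_xg / (RCF_CONSTANT * rpm ** 2)


# (rpm, x g) pairs as quoted in the text; the labels are for readability only.
QUOTED_SPINS = {
    "lysate clearing (JA-10)": (4000, 2830),
    "PEG pellet (GS-6R)": (4000, 3500),
    "CsCl step gradient (TLS-55)": (28000, 67000),
    "CsCl isopycnic gradient (TLV)": (40000, 64000),
}

if __name__ == "__main__":
    for label, (rpm, xg) in QUOTED_SPINS.items():
        radius = implied_radius_cm(xg, rpm)
        print(f"{label}: {rpm} rpm, {xg} x g -> implied radius {radius:.2f} cm")
```

Comparing each implied radius with the rotor's published maximum radius is a quick way to catch a transcription error in either number.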

Example 12. DNA sequencing of the Bacteriophage 182 genome.

Four micrograms of phage DNA was diluted in 200 µl of TE (10 mM Tris
[pH 8.0], 1 mM EDTA) in a 1.5 ml Eppendorf tube and sonicated
(550 Sonic Dismembrator, Fisher Scientific). Samples were sonicated at an
amplitude of 3 µm with bursts of 5 s spaced by 15 s of cooling in ice/water for 3 to 4
cycles. The sonicated DNA was then size-fractionated by electrophoresis on 1%
agarose gels using TAE (1x TAE is 40 mM Tris-acetate, 1 mM EDTA [pH 8.0])
as the running buffer. Fractions ranging from 1 to 2 kbp were excised from the
agarose gel and purified using a commercial DNA extraction system according to the
instructions of the manufacturer (Qiagen), with a final elution in 50 µl of 1 mM Tris
[pH 8.5].

The ends of the sonicated DNA fragments were repaired with a combination of
T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I, as
follows. Reactions were performed in a reaction mixture (final volume, 100 µl)
containing sonicated phage DNA, 10 mM Tris-HCl [pH 8.0], 50 mM NaCl, 10 mM
MgCl2, 1 mM DTT, 50 µg/ml BSA, 100 µM of each dNTP and 15 units of T4 DNA
polymerase (New England Biolabs) for 20 min at 12°C, followed by addition of 12.5
units of the Klenow large fragment of DNA polymerase I (New England Biolabs) for
15 min at room temperature. The reaction was stopped by two phenol/chloroform
extractions, the DNA was precipitated with ethanol, and the final DNA pellet was
resuspended in 20 µl of H2O.

Blunt-ended DNA fragments were cloned by ligation directly into the Hinc II
site of the pKSII+ vector (New England Biolabs) dephosphorylated by treatment with
calf intestinal alkaline phosphatase (New England Biolabs). A typical ligation reaction
contained 100 ng of vector DNA and 2 to 5 µl of repaired sonicated phage DNA (50-100
ng) in a final volume of 20 µl containing 800 units of T4 DNA ligase (New England
Biolabs) and was incubated overnight at 16°C. Transformation and selection of
bacterial clones containing recombinant plasmids were performed in E. coli DH10β
according to standard procedures (Sambrook et al., 1989).

Recombinant clones were picked from agar plates into 96-well plates
containing 100 µl LB and 100 µg/ml ampicillin and incubated at 37°C. The presence
of a phage DNA insert was confirmed by PCR amplification using T3 and T7 primers
flanking the Hinc II cloning site of the pKS vector. PCR amplification of the potential
foreign inserts was performed in a 15 µl reaction volume containing 10 mM Tris (pH
8.3), 50 mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 µM primer, 187.5 µM each dNTP,
and 0.75 units Taq polymerase (BRL). The thermocycling parameters were as
follows: initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec
denaturation at 94°C, 30 sec annealing at 58°C, and 2 min extension at 72°C,
followed by a single extension step at 72°C for 10 min. Clones with insert sizes of 1
to 2 kbp were selected and plasmid DNA was prepared from the selected clones using
the QIAprep™ spin miniprep kit (Qiagen).

The nucleotide sequence of the extremities of each recombinant clone was
determined using an ABI 377-36 automated sequencer with two types of chemistry:
ABI Prism BigDye™ primer cycle sequencing (21M13 primer: #403055; M13REV
primer: #403056) or the ABI Prism BigDye™ terminator cycle sequencing ready reaction
kit (Applied Biosystems; #4303152). To ensure co-linearity of the sequence data and
the genome, all regions of the phage genome were sequenced at least once from both
directions on two separate clones. In areas where this criterion was not initially met, a
sequencing primer was selected and phage DNA was used directly as sequencing
template with the ABI Prism BigDye™ terminator cycle sequencing ready reaction
kit.


Example 13. Bioinformatic management of primary nucleotide sequence.

Sequence contigs were assembled using Sequencher™ 3.1 software
(GeneCodes). To close contig gaps, sequencing primers were selected near the edges of
the contigs. Phage DNA was used directly as sequencing template with the ABI
Prism BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems;
#4303152). The complete sequence of Enterococcus bacteriophage 182 is shown in
Table 21.

A software program was used on the assembled sequence of bacteriophage 182
to identify all putative ORFs larger than 33 codons. The software scans the primary
nucleotide sequence starting at nucleotide #1 for an appropriate start codon. Three
possible selections can be made for defining the nature of the start codon: I) selection
of ATG; II) selection of ATG or GTG; and III) selection of any of ATG, GTG, TTG,
CTG, ATT, ATC, and ATA. This latter initiation codon set corresponds to the one
reported by the NCBI (http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c)
for the bacterial genetic code. When an
appropriate start codon is encountered, a counting mechanism is employed to count
the number of codons (groups of three nucleotides) between this start codon and the
next stop codon downstream of it. If a threshold value of 33 is reached or exceeded,
then the sequence encompassed by these two codons is defined as an ORF. This
procedure is repeated, each time starting at the next nucleotide following the previous
stop codon found, in order to identify all the other putative ORFs. The scan is
performed on all three reading frames of both DNA strands of the phage sequence.
The predicted ORFs for bacteriophage 182 are listed in Tables 22 & 23.
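
The scanning procedure described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the program actually used: the function names are hypothetical, the codon count is assumed to run from the start codon through the last sense codon, the reported span is assumed to include the stop codon, and a start codon with no downstream in-frame stop is assumed to be discarded.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# The three start-codon selections named in the text.
START_CODON_SETS = {
    "I": {"ATG"},
    "II": {"ATG", "GTG"},
    "III": {"ATG", "GTG", "TTG", "CTG", "ATT", "ATC", "ATA"},
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def scan_frame(seq, frame, starts, min_codons):
    """Scan one reading frame; return (start, end) pairs, 0-based, end-exclusive."""
    orfs = []
    orf_start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if orf_start is None:
            if codon in starts:
                orf_start = i  # first start codon after the previous stop
        elif codon in STOP_CODONS:
            # Codons from the start codon up to, but excluding, the stop codon.
            if (i - orf_start) // 3 >= min_codons:
                orfs.append((orf_start, i + 3))
            orf_start = None  # resume the search after this stop codon
    return orfs  # an open start with no downstream stop is dropped (assumption)


def find_orfs(seq, start_set="III", min_codons=33):
    """Scan all three frames of both strands; coordinates refer to `seq`."""
    seq = seq.upper()
    starts = START_CODON_SETS[start_set]
    found = []
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in scan_frame(template, frame, starts, min_codons):
                if strand == "-":
                    # Map reverse-strand positions back onto the forward strand.
                    start, end = len(seq) - end, len(seq) - start
                found.append((strand, frame, start, end))
    return found
```

Calling `find_orfs(genome, start_set="I")` restricts initiation to ATG, while the default `"III"` uses the full bacterial initiation codon set; the threshold is left as a parameter so that other cut-offs can be tried.
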

Sequence homology searches for each ORF were carried out using an implementation
of BLAST programs. Downloaded public databases used for sequence analysis
include:

i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z);

ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z);

iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z);

iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z);

v) Staphylococcus aureus NCTC 8325 (ftp://ftp.genome.ou.edu/pub/staph/staph-lk.fa);

vi) Streptococcus pyogenes
(ftp://ftp.tigr.org/pub/data/s_pneumoniae/gsp.contigs.112197.Z);

vii) PRODOM
(ftp://ftp.toulouse.inra.fr/pub/prodom/current_release/prodom99.1.forblast.gz);

viii) DOMO (ftp://ftp.infobiogen.fr/pub/db/domo/);

ix) TREMBL (ftp://www.expasy.ch/databases/sp_tr_nrdb/fasta/)

The results of the homology searches performed on the ORFs of bacteriophage 
182 are shown in Tables 24 & 26. 

Example 14. Sub-cloning of Bacteriophage 182 ORFs.
Preparation of the shuttle expression vector

Expression preferably utilizes a shuttle expression vector which is arranged
such that expression of the exogenous bacteriophage 182 ORF sequence is inducible.
For example, the plasmid pND50 replicates in E. coli, E. faecalis, and S. aureus
(Yamagishi, J., Kojima, T., Oyamada, Y., Fujimoto, K., Hattori, H., Nakamura, S.,
and Inoue, M. 1996. Antimicrob. Agents Chemother. 40, 1157-1163). This plasmid
can be modified by conventional techniques to insert the inducible arsenite promoter,
derived from the shuttle vector pTOO21, in which firefly luciferase (lucFF)
expression is controlled by the ars promoter/operator from a S. aureus plasmid
(Tauriainen, S., Karp, M., Chang, W. and Virta, M. (1997). Recombinant luminescent
bacteria for measuring bioavailable arsenite and antimonite. Appl. Environ. Microbiol.
63:4456-4461). This modified shuttle vector will contain the ars promoter, the arsR gene
and a cloning site for introduction of individual phage ORFs downstream from a
Shine-Dalgarno sequence.

Other inducible regulatory sequences can be utilized instead of the arsenite-
inducible system. An example is a nisin-inducible system. The nisA promoter activity
is dependent on the proteins NisR and NisK, which constitute a two-component signal
transduction system that responds to the extracellular inducer nisin. The nisin
sensitivity and the inducer concentration required for maximal induction vary among
strains, but the system is functional in Streptococcus pyogenes, Streptococcus agalactiae,
Streptococcus pneumoniae, Enterococcus faecalis, and Bacillus subtilis. Significant
induction of the nisA promoter (10- to 60-fold induction) can be obtained in all of these
species. A vector containing this promoter was published as Eichenbaum Z, Federle
MJ, Marra D, de Vos WM, Kuipers OP, Kleerebezem M, and Scott JR (1998) Appl
Environ Microbiol 64, 2763-2769. Other vectors, e.g., plasmids, which
allow replication and transcription in Enterococcus can also be utilized.

Alternatively, a constitutive promoter can be used (e.g., the β-lactamase
promoter is constitutive in E. faecalis; see ref. 1) to drive expression of the
introduced ORF, and cell growth is then compared to that of control bacterial cells containing the
parental vector lacking any introduced phage ORF. Recombinant plasmids are
introduced into E. faecalis strain FA2-2 by electroporation, as previously described
(Yamagishi, J., Kojima, T., Oyamada, Y., Fujimoto, K., Hattori, H., Nakamura, S.,
and Inoue, M. 1996. Antimicrob. Agents Chemother. 40, 1157-1163).
Cloning of ORFs with a Shine-Dalgarno sequence 

ORFs with a Shine-Dalgarno sequence are selected for functional analysis of
bacterial killing. Each ORF, from initiation codon to last codon (excluding the stop
codon), will be amplified by PCR from phage genomic DNA. For PCR amplification
of ORFs, each sense strand primer starts at the initiation codon and is preceded by a
restriction site, and each antisense strand primer starts at the last codon (excluding the stop
codon) and is preceded by a different restriction site. The PCR product of each ORF
will be gel purified and digested with the restriction enzymes whose sites are contained in
the PCR oligonucleotides. The digested PCR product is then gel purified using the
Qiagen kit, ligated into the modified shuttle vector, and used to transform bacterial
strain DH10β. Recombinant clones are then picked and their insert sizes confirmed by
PCR analysis using primers flanking the cloning site as well as by restriction digestion.
The sequence fidelity of cloned ORFs will be verified by DNA sequencing using the
same primers as used for PCR. In cases where verification of an ORF cannot be
achieved by one pass of sequencing using primers flanking the cloning site, internal
primers will be selected and used for sequencing. Recombinant plasmids will be
introduced into E. faecalis strain FA2-2 by electroporation, as previously described
(Yamagishi, J., Kojima, T., Oyamada, Y., Fujimoto, K., Hattori, H., Nakamura, S.,
and Inoue, M. 1996. Antimicrob. Agents Chemother. 40, 1157-1163).
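
The primer layout described above (restriction site followed by the first codons of the ORF for the sense primer, and a different restriction site followed by the reverse complement of the last codons, stop codon excluded, for the antisense primer) can be sketched as follows. The two sites used here (BamHI and HindIII) and the annealing length are hypothetical choices for illustration; the text does not specify them.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Hypothetical pair of different restriction sites; not taken from the text.
SENSE_SITE = "GGATCC"      # BamHI recognition sequence
ANTISENSE_SITE = "AAGCTT"  # HindIII recognition sequence


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def strip_stop(orf):
    """Drop a terminal stop codon so the ORF ends at its last sense codon."""
    orf = orf.upper()
    return orf[:-3] if orf[-3:] in STOP_CODONS else orf


def design_primers(orf, anneal_len=18,
                   sense_site=SENSE_SITE, antisense_site=ANTISENSE_SITE):
    """Return (sense, antisense) primers, both written 5' to 3'.

    anneal_len is an illustrative default, not a value from the text.
    """
    coding = strip_stop(orf)
    if len(coding) < anneal_len:
        raise ValueError("ORF is shorter than the requested annealing length")
    sense = sense_site + coding[:anneal_len]
    antisense = antisense_site + reverse_complement(coding[-anneal_len:])
    return sense, antisense
```

In practice a few extra bases are often added 5' of each site so the enzymes cut efficiently near the end of the PCR product; that refinement is left out of this sketch.
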
Induction of gene expression from the ars promoter. 

If an inducible promoter is used, e.g., the ars promoter, induction can be
assessed, for example, by either of the following two methods.

1. Screening on agar plates

The functional identification of killer ORFs can be performed by spreading an aliquot
of E. faecalis transformed cells containing a phage 182 ORF onto agar plates containing
different concentrations of sodium arsenite (0; 2.5; 5; and 7.5 µM). The plates are
incubated overnight at 37°C, after which growth of the ORF
transformants on plates that contain arsenite is compared to growth on plates without arsenite.
2. Quantification of growth inhibition in liquid medium 

Cells containing different recombinant plasmids can be grown overnight at
37°C in LB medium supplemented with the appropriate antibiotic selection. These are
then diluted to mid log phase (OD540 = 0.2) with fresh medium containing antibiotic
and transferred to 96-well microtitration plates (100 µl/well). Inducer is then added at
different final concentrations (ranging from 2.5 to 10 µM) and the cultures are incubated
for an additional 2 hrs at 37°C. The effect of expression of the phage 182 ORFs on
bacterial cell growth is then monitored by measuring the OD540 and comparing the rate
of growth to that of the culture not containing inducer. As positive controls for growth
inhibition, the kilA gene of phage lambda (Reisinger, GR., Rietsch, A., Lubitz, W. and
Blasi, U. 1993. Virology 193:1033-1036) and the holin/lysin genes of the
Staphylococcus aureus phage Twort (Loessner, MJ., Gaeng, S., Wendlinger, G.,
Maier, SK. and Scherer, S. 1998. FEMS Microbiology Letters 162:265-274) were
subcloned into the ars-inducible vector. An aliquot of the induced and uninduced
cultures can also be plated out on agar plates containing an appropriate antibiotic
selection but lacking inducer. Following incubation overnight at 37°C, the number of
colonies is counted. Any ORF showing bacteriostatic activity will show a lower, but
detectable, number of colonies on the agar plates when grown in the presence of
inducer as compared to when grown in the absence of inducer. Any ORF showing
bactericidal activity will show no colonies on the agar plates when grown in the
presence of inducer as compared to when grown in the absence of inducer.
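
The colony-count criteria above amount to a simple decision rule, sketched here. The function name and the `min_fold_reduction` cut-off are assumptions added for illustration; the text only distinguishes "lower, but detectable" from "no colonies" and gives no numerical threshold.

```python
def classify_orf(colonies_uninduced, colonies_induced, min_fold_reduction=2.0):
    """Classify an ORF from colony counts of uninduced and induced cultures.

    min_fold_reduction is a hypothetical cut-off for calling a reduction
    meaningful; it is not specified in the text.
    """
    if colonies_uninduced <= 0:
        # Without growth in the uninduced control nothing can be concluded.
        return "uninterpretable"
    if colonies_induced == 0:
        return "bactericidal"
    if colonies_uninduced / colonies_induced >= min_fold_reduction:
        return "bacteriostatic"
    return "no clear effect"


if __name__ == "__main__":
    # Made-up counts, purely to show the call pattern.
    for uninduced, induced in [(200, 0), (200, 40), (200, 190)]:
        print(uninduced, induced, classify_orf(uninduced, induced))
```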

Example 15. Growth of Streptococcus bacteriophage Dp-1 and purification of
genomic DNA.

The Streptococcus pneumoniae R6 propagating strain (PS) (Tomasz,
1966) was used as host to propagate its respective phage Dp-1 (McDonnell et al.,
1975). (Alternatively, Streptococcus (Diplococcus) pneumoniae R36A could be used.
Strain R36A is available from ATCC as #11733 or 27336. Streptococcus pneumoniae
is also available from the Felix d'Herelle Reference Center in Quebec, Canada as catalog
number HER 1054. Other S. pneumoniae strains are also available from ATCC.)
Two rounds of plaque purification of phage Dp-1 were performed on soft agar
essentially as described in Sambrook et al. (1989). Briefly, the Streptococcus R6 PS
strain was grown overnight at 37°C in K-CAT medium [K-CAT: 10 g Bacto casitone, 5 g
Bacto tryptone, 1 g yeast extract, 5 g potassium chloride, 0.2% glucose, 50 mM
potassium phosphate buffer [pH 8] and 250,000 units catalase per liter (Boehringer
Mannheim #10683600)]. The culture was then diluted 20-fold in K-CAT and
incubated at 37°C with constant agitation until the OD540 = 0.2 (early log phase). In
order to obtain single plaques, Dp-1 phage was subjected to 10-fold serial dilutions
in phage buffer (100 mM Tris-HCl [pH 7.5], 100 mM NaCl and 10 mM
MgCl2), and 10 µl of each dilution was used to infect 0.5 ml of the cell suspension.
After incubation for 15 min at 37°C, 2 ml of melted soft agar (K-CAT supplemented
with 0.8% agar) was added to the mixture, which was poured onto the surface of 100 mm
K-CAT agar plates [K-CAT supplemented with 1.2% agar]. After solidification of
the soft agar layer, an additional 5 ml of melted soft agar was added to visualize
distinct plaques (Ronda et al., 1978). After overnight incubation at 37°C, a single
plaque was isolated and resuspended in 1 ml of phage buffer by end-over-end rotation for
2 hrs at room temperature, and the phage suspension was diluted and used for a
second infection as described above. After overnight incubation at 37°C, a single
plaque was isolated and used as a stock for all subsequent manipulations.

The propagation procedure for bacteriophage Dp-1 was modified from the
agar layer method of Swanstrom and Adams (1951). Briefly, the R6 strain of
Streptococcus pneumoniae was grown to stationary phase overnight at 37°C in
K-CAT. The culture was then diluted 20-fold in K-CAT and incubated at 37°C until the
OD540 = 0.2. The suspension (15 x 10^7 bacteria) was then mixed with 15 x 10^5 plaque
forming units (pfu) to give a ratio of 100 bacteria/pfu. After incubation for 15 min at
37°C, 7.5 ml of melted soft agar (K-CAT plus 0.8% agar) was added to the mixture,
which was poured onto the surface of 150 mm K-CAT agar plates and incubated for 16 hrs at
37°C. After solidification of the soft agar layer, 7.5 ml of melted soft agar was added
to each plate. To collect the plate lysate, 20 ml of K-CAT medium was added to each
plate and the soft agar layers were scraped off with a clean microscope
slide, followed by vigorous shaking of the agar suspension for 5 min to break up the
agar. The mixture was then centrifuged for 10 min at 4,000 rpm (2,830 x g) using a
JA-10 rotor (Beckman), and the supernatant (lysate) was collected and
treated with 10 µg/ml of DNase I and RNase A for 30 min at 37°C. To precipitate
the phage particles, the phage suspension was adjusted to 10% (w/v) PEG 8000 and
0.5 M NaCl, followed by incubation at 4°C for 16 hrs. The phage was recovered by
centrifugation at 4,000 rpm (3,500 x g) for 20 min at 4°C in a GS-6R table top
centrifuge (Beckman). The pellet was resuspended in 2 ml of phage buffer (100
mM Tris-HCl [pH 7.5], 100 mM NaCl and 10 mM MgCl2). The phage suspension
was extracted with 1 volume of chloroform and further purified by centrifugation on a
cesium chloride step gradient as described in Sambrook et al. (1989), using a TLS-55
rotor in an Optima TLX ultracentrifuge (Beckman) for 2 hrs at 28,000
rpm (67,000 x g) at 4°C. Banded phage was collected and ultracentrifuged again on an
isopycnic cesium chloride gradient (1.45 g/ml) at 40,000 rpm (64,000 x g) for 24 hrs at
4°C using a TLV rotor (Beckman). The phage was harvested and dialyzed for 4 hrs at
room temperature against 4 L of dialysis buffer consisting of 10 mM NaCl, 50 mM
Tris-HCl [pH 8] and 10 mM MgCl2. Phage DNA was prepared from the phage
suspension by adding 20 mM EDTA, 50 µg/ml proteinase K and 0.5% SDS and
incubating for 1 hr at 65°C, followed by successive extractions with 1 volume of
phenol, 1 volume of phenol-chloroform and 1 volume of chloroform. The DNA was
then dialyzed overnight at 4°C against 4 L of TE (10 mM Tris-HCl [pH 8.0], 1 mM
EDTA).
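
The infection ratio used in the propagation step is simple arithmetic (15 x 10^7 bacteria divided by 15 x 10^5 pfu gives 100 bacteria per pfu). The sketch below reproduces it and adds the conventional plaque-count titer formula as an assumption, since the text does not state how the stock titer was determined; all function names and the example counts are illustrative.

```python
def bacteria_per_pfu(bacteria, pfu):
    """Ratio of bacterial cells to plaque forming units in the infection mix."""
    return bacteria / pfu


def titer_pfu_per_ml(plaques, dilution_factor, volume_plated_ml):
    """Conventional titer estimate (assumption, not described in the text).

    dilution_factor is the fold dilution of the plated sample, e.g. 1e6 for
    the sixth tube of a 10-fold dilution series.
    """
    return plaques * dilution_factor / volume_plated_ml


def volume_for_pfu_ml(target_pfu, titer):
    """Volume of stock (ml) that contains the requested number of pfu."""
    return target_pfu / titer


if __name__ == "__main__":
    print(bacteria_per_pfu(15e7, 15e5))
    # Hypothetical plate: 42 plaques from 10 µl of a 10^6-fold dilution.
    stock = titer_pfu_per_ml(42, 1e6, 0.010)
    print(f"{volume_for_pfu_ml(15e5, stock) * 1000:.3f} µl of stock for 15 x 10^5 pfu")
```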

Example 16. DNA sequencing of the Bacteriophage Dp-1 genome.

Four micrograms of phage DNA was diluted in 200 µl of TE (10 mM Tris
[pH 8.0], 1 mM EDTA) in a 1.5 ml Eppendorf tube and sonicated
(550 Sonic Dismembrator, Fisher Scientific). Samples were sonicated at an
amplitude of 3 µm with bursts of 5 sec spaced by 15 sec of cooling in ice/water for 3 to 4
cycles. The sonicated DNA was then size-fractionated by electrophoresis on 1%
agarose gels using TAE (1x TAE is 40 mM Tris-acetate, 1 mM EDTA [pH 8.0])
as the running buffer. Fractions ranging from 1 to 2 kbp were excised from the
agarose gel and purified using a commercial DNA extraction system according to the
instructions of the manufacturer (Qiagen), with a final elution in 50 µl of 1 mM Tris
[pH 8.5].

The ends of the sonicated DNA fragments were repaired with a combination of
T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I, as
follows. Reactions were performed in a reaction mixture (final volume, 100 µl)
containing sonicated phage DNA, 10 mM Tris-HCl [pH 8.0], 50 mM NaCl, 10 mM
MgCl2, 1 mM DTT, 50 µg/ml BSA, 100 µM of each dNTP and 15 units of T4 DNA
polymerase (New England Biolabs) for 20 min at 12°C, followed by addition of 12.5
units of the Klenow large fragment of DNA polymerase I (New England Biolabs) for
15 min at room temperature. The reaction was stopped by two phenol/chloroform
extractions, the DNA was precipitated with ethanol, and the final DNA pellet was
resuspended in 20 µl of H2O.

Blunt-ended DNA fragments were cloned by ligation directly into the Hinc II
site of the pKSII+ vector (New England Biolabs) dephosphorylated by treatment with
calf intestinal alkaline phosphatase (New England Biolabs). A typical ligation
reaction contained 100 ng of vector DNA and 2 to 5 µl of repaired sonicated phage DNA
(50-100 ng) in a final volume of 20 µl containing 800 units of T4 DNA ligase (New
England Biolabs) and was incubated overnight at 16°C. Transformation and selection
of bacterial clones containing recombinant plasmids were performed in E. coli DH10β
according to standard procedures (Sambrook et al., 1989).

Recombinant clones were picked from agar plates into 96-well plates
containing 100 µl LB and 100 µg/ml ampicillin and incubated at 37°C. The presence
of a phage DNA insert was confirmed by PCR amplification using T3 and T7 primers
flanking the Hinc II cloning site of the pKS vector. PCR amplification of the potential
foreign inserts was performed in a 15 µl reaction volume containing 10 mM Tris (pH
8.3), 50 mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 µM primer, 187.5 µM each dNTP,
and 0.75 units Taq polymerase (BRL). The thermocycling parameters were as
follows: initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec
denaturation at 94°C, 30 sec annealing at 58°C, and 2 min extension at 72°C,
followed by a single extension step at 72°C for 10 min. Clones with insert sizes of 1
to 2 kbp were selected and plasmid DNA was prepared from the selected clones using
the QIAprep™ spin miniprep kit (Qiagen).
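
The thermocycling parameters above can be written down as a small data structure, which makes the total programmed hold time easy to check. This is only a sketch: the structure and names are illustrative, and ramp times between temperatures, which depend on the instrument, are ignored.

```python
# Each stage is (number of repeats, [(step label, temperature in °C, seconds)]).
COLONY_PCR_PROGRAM = [
    (1, [("initial denaturation", 94, 120)]),
    (20, [("denaturation", 94, 30),
          ("annealing", 58, 30),
          ("extension", 72, 120)]),
    (1, [("final extension", 72, 600)]),
]


def total_hold_seconds(program):
    """Sum of all programmed hold times, excluding temperature ramps."""
    return sum(repeats * sum(seconds for _, _, seconds in steps)
               for repeats, steps in program)


if __name__ == "__main__":
    seconds = total_hold_seconds(COLONY_PCR_PROGRAM)
    print(f"programmed hold time: {seconds} s ({seconds / 60:.0f} min)")
```
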

The nucleotide sequence of the extremities of each recombinant clone was
determined using an ABI 377-36 automated sequencer with two types of chemistry:
ABI Prism BigDye™ primer cycle sequencing (21M13 primer: #403055; M13REV
primer: #403056) or the ABI Prism BigDye™ terminator cycle sequencing ready reaction
kit (Applied Biosystems; #4303152). To ensure co-linearity of the sequence data and
the genome, all regions of the phage genome were sequenced at least once from both
directions on two separate clones. In areas where this criterion was not initially met, a
sequencing primer was selected and phage DNA was used directly as sequencing
template with the ABI Prism BigDye™ terminator cycle sequencing ready reaction
kit.

Example 17. Bioinformatic management of primary nucleotide sequence.

Sequence contigs were assembled using Sequencher™ 3.1 software
(GeneCodes). To close contig gaps, sequencing primers were selected near the edges
of the contigs. Phage DNA was used directly as sequencing template with the ABI
Prism BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems;
#4303152). The complete sequence of Streptococcus bacteriophage Dp-1 is shown in
Table 28.

A software program was used on the assembled sequence of bacteriophage
Dp-1 to identify all putative ORFs larger than 33 codons. The software scans the
primary nucleotide sequence starting at nucleotide #1 for an appropriate start codon.
Three possible selections can be made for defining the nature of the start codon: I)
selection of ATG; II) selection of ATG or GTG; and III) selection of any of ATG,
GTG, TTG, CTG, ATT, ATC, and ATA. This latter initiation codon set corresponds
to the one reported by the NCBI
(http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c) for the bacterial genetic code. When an
appropriate start codon is encountered, a counting mechanism is employed to count
the number of codons (groups of three nucleotides) between this start codon and the
next stop codon downstream of it. If a threshold value of 33 is reached or exceeded,
then the sequence encompassed by these two codons is defined as an ORF. This
procedure is repeated, each time starting at the next nucleotide following the previous
stop codon found, in order to identify all the other putative ORFs. The scan is
performed on all three reading frames of both DNA strands of the phage sequence.
The predicted ORFs for bacteriophage Dp-1 are listed in Tables 29 and 30, and Fig. 6.

Sequence homology searches for each ORF were carried out using an
implementation of BLAST programs. Downloaded public databases used for
sequence analysis include:

i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z);

ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z);

iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z);

iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z);

v) Staphylococcus aureus NCTC 8325
(ftp://ftp.genome.ou.edu/pub/staph/staph-lk.fa);

vi) Streptococcus pyogenes
(ftp://ftp.tigr.org/pub/data/s_pneumoniae/gsp.contigs.112197.Z);

vii) PRODOM
(ftp://ftp.toulouse.inra.fr/pub/prodom/current_release/prodom99.1.forblast.gz);

viii) DOMO (ftp://ftp.infobiogen.fr/pub/db/domo/);

ix) TREMBL (ftp://www.expasy.ch/databases/sp_tr_nrdb/fasta/)

The results of the homology searches performed on the ORFs of bacteriophage 
Dp-1 are shown in Table 31. 
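
When many ORFs are searched against several databases, the hits are usually reduced to one best match per ORF. The sketch below assumes the BLAST results were saved in the common 12-column tab-separated layout (query, subject, percent identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, e-value, bit score); that output format, the e-value cut-off and all identifiers in the example are assumptions, not details given in the text.

```python
def best_hits(tabular_text, max_evalue=1e-5):
    """Return {query: (subject, evalue, bitscore)} keeping the best hit per query.

    max_evalue is a hypothetical cut-off; "best" means lowest e-value,
    with ties broken by the higher bit score.
    """
    best = {}
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a 12-column tabular record
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        current = best.get(query)
        if current is None or (evalue, -bitscore) < (current[1], -current[2]):
            best[query] = (subject, evalue, bitscore)
    return best


if __name__ == "__main__":
    # Hypothetical records, for demonstration only.
    rows = [
        ["ORF001", "subject_A", "45.0", "200", "110", "0", "1", "200", "5", "204", "2e-30", "120"],
        ["ORF001", "subject_B", "60.0", "210", "84", "0", "1", "210", "1", "210", "1e-50", "190"],
        ["ORF002", "subject_C", "25.0", "50", "37", "1", "3", "52", "9", "58", "0.5", "22"],
    ]
    text = "\n".join("\t".join(row) for row in rows)
    print(best_hits(text))
```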

Example 18. Sub-cloning of Bacteriophage Dp-1 ORFs.
Preparation of the shuttle expression vector

Expression preferably utilizes a shuttle expression vector which is arranged
such that expression of the exogenous bacteriophage Dp-1 ORF sequence is inducible.
For example, the plasmid pLSE4 replicates in E. coli and S. pneumoniae (Diaz and
Garcia, 1990). This plasmid can be modified by conventional techniques to insert the
inducible arsenite promoter, derived from the shuttle vector pTOO21, in which
firefly luciferase (lucFF) expression is controlled by the ars promoter/operator from a
S. aureus plasmid (Tauriainen, S., Karp, M., Chang, W. and Virta, M. (1997).
Recombinant luminescent bacteria for measuring bioavailable arsenite and antimonite.
Appl. Environ. Microbiol. 63:4456-4461). This modified shuttle vector will contain
the ars promoter, the arsR gene and a cloning site for introduction of individual phage
ORFs downstream from a Shine-Dalgarno sequence.

Other inducible regulatory sequences can be utilized instead of the arsenite-
inducible system. An example is a nisin-inducible system. The nisA promoter activity
is dependent on the proteins NisR and NisK, which constitute a two-component signal
transduction system that responds to the extracellular inducer nisin. The nisin
sensitivity and the inducer concentration required for maximal induction vary among
strains, but the system is functional in Streptococcus pyogenes, Streptococcus agalactiae,
Streptococcus pneumoniae, Enterococcus faecalis, and Bacillus subtilis. Significant
induction of the nisA promoter (10- to 60-fold induction) can be obtained in all of these
species. A vector containing this promoter was published as Eichenbaum Z, Federle
MJ, Marra D, de Vos WM, Kuipers OP, Kleerebezem M, and Scott JR (1998) Appl
Environ Microbiol 64, 2763-2769. Other vectors, e.g., plasmids, which
allow replication and transcription in Streptococcus can also be utilized.

Alternatively, a constitutive promoter can be used to drive expression
of the introduced ORF, and cell growth is then compared to that of control bacterial cells containing
the parental vector lacking any introduced phage ORF. Recombinant plasmids are
introduced into S. pneumoniae R6 as previously described (Diaz and Garcia, 1990).

Cloning of ORFs with a Shine-Dalgarno sequence 

ORFs with a Shine-Dalgarno sequence are selected for functional analysis of
bacterial killing. Each ORF, from initiation codon to last codon (excluding the stop
codon), will be amplified by PCR from phage genomic DNA. For PCR amplification
of ORFs, each sense strand primer starts at the initiation codon and is preceded by a
restriction site, and each antisense strand primer starts at the last codon (excluding the stop
codon) and is preceded by a different restriction site. The PCR product of each ORF
will be gel purified and digested with the restriction enzymes whose sites are contained in
the PCR oligonucleotides. The digested PCR product is then gel purified using the
Qiagen kit, ligated into the modified shuttle vector, and used to transform bacterial
strain DH10β. Recombinant clones are then picked and their insert sizes confirmed
by PCR analysis using primers flanking the cloning site as well as by restriction
digestion. The sequence fidelity of cloned ORFs will be verified by DNA sequencing
using the same primers as used for PCR. In cases where verification of an ORF
cannot be achieved by one pass of sequencing using primers flanking the cloning site,
internal primers will be selected and used for sequencing. Recombinant plasmids will
be introduced into S. pneumoniae R6 as previously described (Diaz and Garcia, 1990).
Induction of gene expression from the ars promoter. 

If an inducible promoter is used, e.g., the ars promoter, induction can be
assessed, for example, by either of the following two methods.

1. Screening on agar plates

The functional identification of killer ORFs can be performed by spreading an
aliquot of S. pneumoniae transformed cells containing phage Dp-1 ORFs onto agar
plates containing different concentrations of sodium arsenite (0; 2.5; 5; and 7.5 µM).
The plates are incubated overnight at 37°C, after which growth of the
ORF transformants on plates that contain arsenite is compared to growth on plates without
arsenite.

2. Quantification of growth inhibition in liquid medium

Cells containing different recombinant plasmids can be grown overnight at
37°C in LB medium supplemented with the appropriate antibiotic selection. These are
then diluted to mid-log phase (OD540 ~0.2) with fresh media containing antibiotic
and transferred to 96-well microtitration plates (100 µl/well). Inducer is then added at
different final concentrations (ranging from 2.5 to 10 µM) and the culture incubated
for an additional 2 hrs at 37°C. The effect of expression of the phage Dp-1 ORFs on
bacterial cell growth is then monitored by measuring the OD540 and comparing the rate
of growth to that of the culture not containing inducer. As positive controls for growth
inhibition, the kil gene of phage lambda (Reisinger, GR., Rietsch, A., Lubitz, W. and
Blasi, U. 1993. Virology 193:1033-1036) and the holin/lysin genes of the
Staphylococcus aureus phage Twort (Loessner, MJ., Gaeng, S., Wendlinger, G.,
Maier, SK. and Scherer, S. 1998. FEMS Microbiology Letters 162:265-274) can be
subcloned into the ars inducible vector. An aliquot of the induced and uninduced
cultures can also be plated out on agar plates containing an appropriate antibiotic
selection but lacking inducer. Following incubation overnight at 37°C, the number of
colonies is counted. Any ORF showing bacteriostatic activity will show a lower, but
detectable, number of colonies on the agar plates when grown in the presence of
inducer as compared to when grown in the absence of inducer. Any ORF showing
full bactericidal activity will show no colonies on the agar plates when grown in the
presence of inducer as compared to when grown in the absence of inducer.
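
The comparison of induced and uninduced cultures described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the record layout, the field names, and the example readings are hypothetical, and the classification uses the plain colony-count criteria of the text without any statistical treatment of replicate plates.

```python
from dataclasses import dataclass


@dataclass
class OrfAssay:
    """Readings for one ORF clone (all field names are illustrative)."""
    od_start: float          # OD540 when inducer is added
    od_induced: float        # OD540 after 2 hrs with inducer
    od_uninduced: float      # OD540 after 2 hrs without inducer
    colonies_induced: int    # colonies plated from the induced culture
    colonies_uninduced: int  # colonies plated from the uninduced culture


def relative_growth(assay: OrfAssay) -> float:
    """OD540 increase with inducer as a fraction of the increase without it."""
    baseline = assay.od_uninduced - assay.od_start
    if baseline <= 0:
        raise ValueError("uninduced control did not grow")
    return (assay.od_induced - assay.od_start) / baseline


def classify(assay: OrfAssay) -> str:
    """Apply the colony-count criteria: none = bactericidal, fewer = bacteriostatic."""
    if assay.colonies_uninduced <= 0:
        raise ValueError("uninduced control gave no colonies")
    if assay.colonies_induced == 0:
        return "bactericidal"
    if assay.colonies_induced < assay.colonies_uninduced:
        return "bacteriostatic"
    return "no inhibition detected"


# Hypothetical readings for a single clone.
example = OrfAssay(od_start=0.2, od_induced=0.25, od_uninduced=0.8,
                   colonies_induced=40, colonies_uninduced=400)
print(round(relative_growth(example), 3), classify(example))
```

In practice a threshold on the colony ratio, rather than a strict comparison, would likely be needed to allow for plating variability.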
All patents and publications mentioned in the specification are indicative of
the levels of skill of those skilled in the art to which the invention pertains. All
references cited in this disclosure are incorporated by reference to the same extent as
if each reference had been incorporated by reference in its entirety individually.

One skilled in the art would readily appreciate that the present invention is
well adapted to carry out the objects and obtain the ends and advantages mentioned,
as well as those inherent therein. The specific methods and compositions described
herein as presently representative of preferred embodiments are exemplary and are not
intended as limitations on the scope of the invention. Changes therein and other uses
will occur to those skilled in the art which are encompassed within the spirit of the
invention and are defined by the scope of the claims.

It will be readily apparent to one skilled in the art that varying substitutions
and modifications may be made to the invention disclosed herein without departing
from the scope and spirit of the invention. For example, those skilled in the art will
recognize that the invention may suitably be practiced using a variety of different
bacteria, bacteriophage, and sequencing methods within the general descriptions
provided.

The invention illustratively described herein suitably may be practiced in the
absence of any element or elements, limitation or limitations which is not specifically
disclosed herein. Thus, for example, in each instance herein any of the terms
"comprising," "consisting essentially of" and "consisting of" may be replaced with
either of the other two terms. The terms and expressions which have been employed
are used as terms of description and not of limitation, and there is no intention in
the use of such terms and expressions of excluding any equivalents of the features
shown and described or portions thereof, but it is recognized that various
modifications are possible within the scope of the invention claimed. Thus, it should
be understood that although the present invention has been specifically disclosed by
preferred embodiments and optional features, modification and variation of the
concepts herein disclosed may be resorted to by those skilled in the art, and that such
modifications and variations are considered to be within the scope of this invention as
defined by the appended claims.

In addition, where features or aspects of the invention are described in terms of
Markush groups or other grouping of alternatives, those skilled in the art will
recognize that the invention is also thereby described in terms of any individual
member or subgroup of members of the Markush group or other group. For example,
if there are alternatives A, B, and C, all of the following possibilities are included: A
separately, B separately, C separately, A and B, A and C, B and C, and A and B and
C. Thus, for example, for the bacteria and phage specified herein, the embodiments
expressly include any subset or subgroup of those bacteria and/or phage. While each
such subset or subgroup could be listed separately, for the sake of brevity, such a
listing is replaced by the present description.
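
As an illustration of the A, B, C example above, the following minimal Python sketch enumerates every non-empty subgroup of a list of alternatives. The function name is hypothetical; the sketch only shows the counting involved and is not part of the disclosed methods.

```python
from itertools import combinations


def markush_subgroups(members):
    """Return every non-empty subset of the alternatives, smallest subsets first."""
    items = list(members)
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


# For three alternatives this yields the seven possibilities listed in the text.
for group in markush_subgroups(["A", "B", "C"]):
    print(" and ".join(group))
```

A group of n alternatives has 2^n - 1 such subgroups, which is why a separate listing quickly becomes impractical for the bacteria and phage named herein.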

Thus, additional embodiments are within the scope of the invention and within
the following claims.

Table 1



Phages against human and animal pathogenic bacteria 



Pathogen
name 


Phage name 


Catalog no.


Origin/reference 


Acinetobacter 
calcoaceticus 


A3/2 

A10/45

A36 

B9GP 

B,PP 

BS46 

E13 

E14 

531 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Ao3 
P78 




J. Bacteriol 1984. 157: 179-183 

J. Gen. Microbiol 1986.132: 2633-2636 


Acinetobacter 
haemolyticus






Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Acinetobacter 
johnsonii 






Felix d'Herelle Reference
Centre, Quebec, Quebec


Acinetobacter sp. 


BP1 




J. Virol. 1968. 2: 716-722


G4, HP2, HP3 & 
HP4 




Can. J. Microbiol. 1966. 12: 1023-1030 &
J. Virol. 1974. 13: 46-52 &
Arch. Virol. 1994. 135: 345-354


A1.A4, A9& 
196 




Arch. Virol. 1994. 135: 345-354


HP1 




Can. J. Microbiol. 1966. 12: 1023-1030


A19.A23.A29, 
A31,A33,A34, 
A3759&2845 




J. Microsc. (Paris) 1973. 16: 215-224 &
C. R. Hebd. Seances Acad. Sci. Ser. D Sci.
Nat. (Paris) 278: 1907-1909 &
Arch. Virol. 1994. 135: 345-354 &
Rev. Can. Biol. 1970. 29: 317-320


Actinobacillus
actinomycetemcomitans






FEMS Microbiol. Lett. 1994. 119: 329-337
Infect. Immun. 1982. 35: 343-349








Mol.Gen.Genet 1998.258: 323-325 




Aaφ247




Oral Microbiol. Immunol. 1997. 12: 40-46


Actinomyces viscosus 




43146-B1 


The American Type Culture Collection 








Infect. Immun. 1985. 48: 228-233








imecuinunun.i7oo. jo. j**- j y 








T)|_ __: J lQo*7 1*7.1 At 1 CI 

fiasnua iyy/.i/:i4i-jjj 


Aeromonas hydrophila 


PM2** &PM3 




FEMS Microbiol. Lett. 1990. 57: 277-282


Aehl 




Felix d'Herelle Reference 




Aeh2 




Centre, Quebec, Quebec




PM4 








PM5 








PM6 








T7-ah


Aeromonas
salmonicida 


3 

25 
29 
31 
32 

40RR2. 8 t 

43 

51 

56 

59.1 

65 

Asp37 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 




55R.1 




Can. J. Microbiol. 1983. 29: 1458-1461 


Alteromonas espejiana


PM2** 


27025-B1 


The American Type Culture Collection 


Asticcacaulis
biprosthecum 






Felix d'Herelle Reference 
Centre.Quebec.Quebec 


Asticcacaulis 
excentricus 


φAc24


15261-B1 
15261-B2 
15261-B3 


The American Type Culture Collection 


Azotobacter vinelandii


A 14 
A21 

A 1 1 

Ail 
PAVl 


12518-B1 

12518-B4 

12518-B5 

12518-B9 

12518-B10

13705-B1 


The American Type Culture Collection 


Azotobacter sp. 






Virology 1972.49:439-452 


Bacteroides fragilis 


Bf-1 




Rev. Infect. Dis. 1979. 1: 325-336 




B40-8 




FEMS Microbiol. Lett 1991. 66: 61-67 




HSP40 




Appl. Environ. Microbiol. 1989. 55: 2696- 
2701 




phiAl 




Zentralbl.bakteriol. 1972.222:57-63 


Bdellovibrio 
bacteriovorus 


MAC-l 




J. Gen. Microbiol. 1987. 133: 3065-3070 


Bdellovibrio sp. 


VL-I 




J.Virol.1973. 12:1522-1533 


Bordetella 
bronchiseptica


214 




Zh. Mikrobiol. Epidemiol. Immunobiol. 1987. 5: 9-13


Bordetella
parapertussis 






Felix d'Herelle Reference 
Centre,Quebec,Quebec 








Mol. Gen. Mikrobiol. Virusol. 1988.4: 22-25 








Zh. Mikrobiol. Epidemiol. Immunobiol. 1987. 5: 9-13




41405 




Zh. Mikrobiol. Epidemiol. Immunobiol. 1987. 5: 9-13


Brucella abortus 






Felix d'Herelle Reference
Centre,Quebec,Quebec 





23448-B1
23448-B2
23448-B3
17385-B1
17385-B2


The American Type Culture Collection 




10/1 
24/H 
212/XV 








BK-2, TB &
Fi**




Zh. Mikrobiol. Epidemiol. Immunobiol. 1983. 2: 48-52




R/c & RJO 




Dev. Biol. Stand. 1984. 56: 55-62


Brucella canis 


R/c 




Dev. Biol. Stand. 1984.56: 55-62 


Brucella melitensis 


BK-2 


23456-B1 


The American Type Culture Collection 


Brucella suis 


Wb 




Zentralbl. Veterinarmed. 1975. 22: 866-867


Fi** & TB




Zh. Mikrobiol. Epidemiol. Immunobiol. 1983. 2: 48-52


Brucella sp. 






Can. J. Vet. Res. 1989.53: 319-325 








Res. Vet Sci. 1988. 44: 45-49 




R 




Zh. Mikrobiol. Epidemiol. Immunobiol. 1983. 2: 48


Campylobacter coli 




43133-B1 


The American Type Culture Collection 






43134-B1 




Campylobacter coli 


18 


43135-B1 


The American Type Culture Collection 


(Cont'd) 


19 


43136-B1 






20 






Campylobacter jejuni 


1 


35918-B1 


The American Type Culture Collection 




2 


35919-B1






3 


35920-B1 






4 


35921-B1 






5 


35918-B2 






6 


35920-B2 






7 


35922-B2 






8 


35923-B1 






9 


35924-B1






10 


35925-B1






11 


35925-B2 






12 


35922-B2 






13 


35924-B2 






14 


35922-B3 






17 


43133-B1 






18 


43134-B1 






19 


43135-B1 






20 


43136-B1 




Campylobacter 


HP1




J. Med. Microbiol. 1993. 38: 245-249 


(Helicobacter) pylori 








Chlamydia psittaci 


Chp1**




J. Gen. Virol. 1989. 70: 3381-3390 


Clostridium 


CAK-1 




J. Bacteriol. 1993. 175: 3838-3843


acetobutylicum


Clostridium botulinum


- 




Nucleic Acids Res. 1990. 18: 1291








Biochem. Biophys. Res. Commun. 1990. 171: 1304-1311








Microbiol. Immunol. 1981. 25: 915-927








J. Vet. Med. Sci. 1992. 54: 675-684




CEβ & CEγ






Clostridium difficile 


41 & 56 




J. Clin. Microbiol. 1985. 21: 251-254


Clostridium






Rev. Can. Biol. 1977. 36: 205-215


perfringens 














FEMS Microbiol. Lett. 1990. 54: 323-326


Clostridium 




8074-B1 


The American Type Culture Collection 


sporogenes 


59 


17886-B1 






70 


17886-B3 






71 


17886-B4 






72S 


17886-B5 






72L 


17886-B6 




Clostridium tetani 


A&B 




Rev. Can. Biol. 1978. 37: 43-46


Corynebacterium 






Vopr. Virusol. 1986. 31: 577-584


diphtheriae








Corynebacterium 


NN 


12319-B1 


The American Type Culture Collection 


pseudotuberculosis 








Corynebacterium sp 


DLC 2921/49 


12052-B1 


The American Type Culture Collection 



WO 00/32825


Enterococcus faecalis


42 


19948-B1 


The American Type Culture Collection 


Enterococcus faecium 


124 
133 


19950-B1
19953-B2
19953-B1 


The American Type Culture Collection 






PCT/IB99/02040


Escherichia coli




11303-B14

11303-B10

11303-B21

8677-B1 

11303-B13 

13706-B4 


The American Type Culture Collection 


Escherichia coli 




15766-B1 


The American Type Culture Collection 






15766- B1 
1242-B5 
15669-B2 

15767- Bl 
11303-B16 
27-65-B1 






C204 
El 

n** 

f2** 

FCZ 
fd** 


25065-B2 
15669-B1 






15597-B1 






21816-B1 






23724-B9 
15593-B1 






25404-B1

29746-B1

23631-B1

25868-B1

25298-B1

25298-B2 

11303-B37 

11303-B24 






Ifl** 


11303-B26 






11303-B27 
11303-B28
11303-B29
11303-B30
11303-B33
11303-B31
11303-B25
11303-B35






MS2** 

MU9 

Mu-1 

Ox6 

PI** 

P4 sicV* 

O-B** 

R17** 

Z1K/1 

ZJ/2 


11303-B34
11303-B36
11303-B32 






13706-B5 
11303-B1 






11303-B2
11303-B3
11303-B4
35060-B1 
35060-B2 






35060-B3 

11303-B5 

11303-B6 

11303-B7

11303-B38 

12141-B1





Escherichia coli
(Cont'd) 



547 

uvi 

UV47 
UV375 
a3** 
X . ** 
XC-17 
X sus P-3 
X sus R-5 
X sus J-6 
X sus 0-8 
X sus A-1 1 
Xtnd" 

φX174**
^Xcs70am-3 


11303-B20 

11303-B17

11303-B15 

11303-BU 

11303-B18 

13706-B2 

23724-B2 

23724-B1 

23724-B3 

23724-B4 

23724-B5 

23724-B6 

23724-B7 

23724-B8 

35860-B1 

13706-B3 

15597-B2 

13706-B1

49696-B1 


The American Type Culture Collection 






G4** &




Biochim. Biophys. Acta 1992. 1130: 277-288


BF23** 




J. Bacteriol. 1977. 129: 265-275


Mul 




J. Ultrastruct. Res. 1966. 14: 441-448


Hpl7 




J. Mol. Biol. 1991. 218: 705-721


K3** & Ox2**




FEBS Lett. 1987. 215: 145-150


Rbl8** Rb51& 
Rb69** 




J. Bacteriol. 1990. 172: 180-186


H1**, H3, H8,
K9,

K18 & Ox1




Mol. Gen. Genet. 1990. 221: 491-494


Ml , lula** & 




J.Mol.Biol. 1987. 196: lo5-l /4 


K10 




J. Bacteriol. 1979. 140: 680-686


Qsr' 




J. Bacteriol. 1985. 162: 256-262


QT70 




J.Lxen.MlCTODlOl.iyoo.1 J4.liJi-l Jjo 


pal oU 




rcMo MicroDioi.L.eu.ii*y». 1 iv. / j-/o 


phi ml 73 




Genetika 1985.21:673-675 


tf-1 




J. Gen. Microbiol. 1987. 133: 953-960


P4&phiR73 




Mol.Microbiol. 1995. 18:201-208 


[ r 2 




J. Gen. Microbiol. 1982. 128: 2797-2804


PRD1 




Virology 1990.177:445-451 


K3hx 




Mol. Gen. Genet. 1987. 206: 110-115


933J** &
933W** 




Infect. Immunity 1986. 53: 135-140


H19-B** 




J. Bacteriol. 1987. 169: 4308-4312


Tcp-ni 




Zentralbl. Bakteriol. Mikrobiol. Hyg. 1988. 270: 41-51



Escherichia coli
(Cont'd) 



N4** 




Vet.Microbiol. 1992.30:203-212 


Phi 80 trp 




Ann. Inst. Pasteur 1971. 120: 121-125


Obeta 1 




J. Bacteriol. 1978. 133: 172-177


P1CM 




J. Gen. Microbiol. 1978. 107: 73-83


PA-2** 




J .riactenol. lyyu. 172:1 ooO- 1 662 


186** 




Mol. Gen. Genet. 1982. 187: 87-95


1 86. IX.B 




Mol. Microbiol. 1992. 6: 2629-2642


21** 




Virology 1983.129:484-489 


P4** 




Microbiol. Rev. 1993. 57: 683-702


82** 




J. Biol. Chem. 1987. 262: 11721-11725


PSP3 




J. Bacteriol. 1996. 178: 5668-5675


HK022** 




Nucleic Acids Res. 1994.22:354-356 


D108** 




Nucleic Acids Res. 1986. 14: 3813-3825


Rb49 




J. Mol. Biol. 1997. 267: 237-249


Ike** 




J.Mol.Biol.1985. 181:27-39 


P22dis 




Mol. Gen. Genet. 1978. 166: 233-243


N15** 




J. Bacteriol. 1996. 178: 1484-1486


in** 




Proc. R. Soc. Lond. B Biol. Sci. 1991. 245: 23-30


Stx2Phi-I & 




Infect. Immun. 1998. 66: 4100-4107


Stx2Phi-II 






18 




Virology 1987. 156: 122-126


X 




J. Gen. Microbiol. 1981. 126: 389-396


AC3 




Mol. Microbiol. 1991. 5: 715-725









BW-1




Felix d'Herelle Reference 




C-l 




Centre, Quebec, Quebec




E920g 








Esc-7-U 








H19J 








Haiti 








HK243 








la 








K20 








K30 








KL 3 








M 








Mu** 








O103 








0157:H7 








P1D 








ptl 








PtIHa 








PR64FS 








PR772 








SS4 








(J4Q 








Xvir** 








Q8 








09-1 


• 






92 






Haemophilus 


HP1**




Nucleic Acids Res. 1996.24:2360-2368 


influenzae 


S2**




Gene 1997. 196: 139-144 


Halobacterium


S45 




Felix d'Herelle Reference 


cutirubrum 






Centre, Quebec, Quebec


Halobacterium






Felix d'Herelle Reference 


halobium 






Centre,Quebec,Quebec 








Can. J. Microbiol. 1982. 28: 916-921


Halobacterium
salinarium 






Biol. Chem. Hoppe-Seyler 1994. 375: 7*7-757



Klebsiella oxytoca


tf-1




J. Gen. Microbiol. 1987. 133: 953-960


Klebsiella pneumoniae 


60 
92 


23356-B1

23357-B1


The American Type Culture Collection 




K.19Q 




Felix d'Herelle Reference
Centre, Quebec, Quebec




FC3-1 & FC3-9




Can J.Microbiol. 1991.37:270-275 




FC3-10 




rfcMb Mjcrobiol.Lett.iyy 1 .0 / .zv I -zy / 


Klebsiella sp. 


KU** 




Mol.Gen.Genet. 1 990.22 I:283-2ao 


Leptospira sp. 


LEI, LE3 & LE4 




Res. Microbiol. 1990. 141: 1131-1138


Listeria 


243 


23074-B1 


The American Type Culture Collection 


monocytogenes 


197,1313 & 
9425 




Appl.Environ.Microbiol.l997.63:3374-3377 




H387 & H387-A 




Appl. Environ. Microbiol. 1993. 59: 2914-2917




5775,6223 
&12682 




APMIS 1993. 101: 160-167




2389, 2671, 
4211 & 2685 




Intervirology 1994. 37: 31-35 &
Zentralbl. Bakteriol. Mikrobiol. Hyg. 1986. 261: 12-28




4b.4ab,4 R &3c 




Ann. Microbiol. (Paris) 1977. 128: 185-198




A118, A500 &
A511**




Mol. Microbiol. 1995. 16: 1231-1241




1,3,4, 5, 6, 7, 8, 
9,10,11,14, 15, 
16. 17. 19&20 




Ann. Microbiol. (Paris) 1979. 130B: 179-189




l/2a, l/2b, 3c, 
4ab, 6a&6b 




Clin. Invest. Med. 1984. 7: 229-232




$LMUP35 
2685 




Felix d'Herelle Reference 
Centre, Quebec, Quebec


Listeria innocua 


4211 




Felix d'Herelle Reference
Centre, Quebec, Quebec


Micrococcus luteus 


N3 
N4 
N8 


4698-B1 
4698-B4 
4698-2 
4698-B3 


The American Type Culture Collection 


Micrococcus luteus 


N17 




Can.J.Microbiol. 1979.25:1027-1035 


Mycobacterium
smegmatis


Bo I** 

Bo 6 
Bo 611 
Bo 6m 

Mc-2 
Mc-4 
NN 

Phagus lacricola 
Rl 


27203-B1

27204-B1

27205-B1
27205-B2
27205-B3
607-B6
607-B7
11727-B1
11759-B1 
607-B1 


The American Type Culture Collection


HER 317
HER 330 
HER 333 
HER 335 
HER 334 
HER 331 
HER 316 


Felix d'Herelle Reference
Centre,Quebec,Quebec 




Legendre 
Uo 
Roy 
Sedge 












Mol. Microbiol. 1993. 7: 395-405








J. Mol. Biol. 1998. 279: 143-164








Proc.Natl.Acad.Sci USA.1988.84:2833-2837 








Mol. Biol. Rep. 1981.30:11-15 








Proc. Natl. Acad. Sci. USA 1997. 94: 10961-10966




29M, 31M, 122, 
154, 37, 29D, 46, 
139,110, 141, 
74D, AG1 & 
DS6A 




Arch. Virol. 1993. 133: 39-49 &
Am. Rev. Respir. Dis. 1975. 112: 17-22


Mycobacterium 
fortuitum 


Bo 4 
Bo 7 


23052-B1
27207-B1
27207-B2 


The American Type Culture Collection 



Mycobacterium leprae






Ann. Microbiol. (Paris) 1982. 133: 93-97


Mycobacterium 
tuberculosis 


DS6A 


25618-B1 
25618-B2 
4243-B1 


The American Type Culture Collection 




110, 139&33D 




Arch. Virol. 1993. 133: 39-49




AG1,GS4E, 
BGl, 

PH&BKl 




The Biology of Mycobacteria. Academic
Press.Toronto 1982 (Ratledge & Stanford) 
1982.309-351 


Mycobacterium sp 


Phagus pellegrini

NN 

Bl 


11760-B1

11761-B1
23239-B1


The American Type Culture Collection



TM4, pnoO,

oh72 

PhAE39 

phAE40
& Bxb1




\Ai^ r nUin\^r,y, 1QQ( 141 »1 1*71 11S1 

Microoioiogy iyyj.i**i.i 101 




C2 




Experientia 1969. 25: 1112-1113




18 & 115 




J.Gen. Virol. 1987.68:949-956 




63 




Grozlica 1968.36:617-622 




phlei & 
butyricum 




J. Gen. Virol. 1975. 29: 235-238




MyF3P-59a 




Z. Allg. Mikrobiol. 1968. 8: 29-37




Bo2a 




J.Gen. Virol. 1973.20:75-87 




D4.D28 & D32 




J. Exptl. Med. 1966. 123: 327-340




HC 




J.Bacteriol. 1963.86:608-609 


Mycobacterium 
vaccae


BS 


15483-B1 


The American Type Culture Collection 


Mycobacterium phlei 


NN 
Bo 2 
Bo2h 
Bo 3 


11728-B1
11758-B1
27086-B2 
27086-Bl 


The American Type Culture Collection 


Mycoplasma 
arthritidis 


MAV1** 




Infect. Immunity 1995. 63: 4016-4023


Mycoplasma hyorhinis 


Hr-1 




Arch.Virol.l983.77:81-85 


Mycoplasma 
pneumoniae 


Br-1 




Arch.ViroU983.75:M5 


Mycoplasma pulmonis 






Plasmid 1995. 33: 41-49 


Mycoplasma sp. 






J. Gen. Microbiol. 1985. 131: 3117-3126








J. Virol. 1986. 59: 584-590








Gene 1994. 141: 1-8 



Microbios 1990. 64: 111-125






Infection and Immunity 1995. 63: 4016-4023






MeABiol.1982.60: 116-120 


MV-L2& 




Arch. Virol. 1979. 61: 289-296






Acta Virol. 1978. 22: 443-450






J. Gen. Virol. 1979. 42: 315-322






Virology 1973.55:118-126 



Science 1971. 173: 725-727

- 


Neisseria perflava 






J. Clin. Microbiol. 1976. 4: 87-91


Nocardia erythropolis


φC




J.Gen. Virol. 1974.23:247-254 




φEC




J. Bacteriol. 1976. 126: 1104-1107


Pasteurella multocida 


B225 




Arch. Exp. Veterinarmed. 1981. 35: 433-436




B939a 




Am. J. Vet. Res. 1978. 39: 1565-1566




Nos.115, 32, 967 
& 

1075 




Vet. Med. Nauki 1977. 14: 33-36


Propionibacterium
acnes 


NN 


29399-B1 


The American Type Culture Collection






Pseudomonas 
aeruginosa 



2 

2A 

2B 

11 

16 

24 

27 

44 

73 

95 

109 

113 

249 

B3 

Hoff2 

Hoff3 

Pa 

Pb 

PB-1 

Pc 

Pf 

PP7** 



7&31 



Pf3** 



(p-MC 



pn« 



PR4** 



A7 



KF1 



<CTX« 



a** 



12175-B1 
12175-B2
12175-B3 
12175-B4 

14205- B1 

14206- B1 

14207- B1 

14208- B1 

14209- B1 

14210- Bl 

14211- B1 

14212- Bl 

14213- B1 

14214- B1 
15692-B1 

14203- B1 

14204- B1 
12055-B1 
12055-B2 
15692-B3 
12055-B3 
25102-B1 
15692-B2



The American Type Culture Collection 



Felix d'Herelle Reference
Centre, Quebec, Quebec



J. Virol. 1983. 47: 221-223



Can. J. Microbiol. 1969. 15: 1179-1186



J. Mol. Biol. 1991. 218: 349-364



J. Gen. Virol. 1979. 43: 583-592



J. Bacteriol. 1992. 174: 2407-2411



J.Biochem. 1983.93:61-71 



Mol. Microbiol. 1993. 4: 1703-1709



J. Virol. 1977. 24: 135-141



dd<



φKZ, 21, φNZ,
PMN17, PTB80,
68, PB-1, E79,
16,

109, 352, 1214,
F8, 71, 337, M4,
φC17, SL2, B17,
Li-24, φmnP78,

PS17**, φ1, 73,
M6, Li-2, 7,
φmnF82,
PTB2, PTB20,
PTB42, φKF77,
31, PTB21,
119x,

φPLS27, B3,
258,

Hw12, PM57,
PM62, PM105,
148, PM681,
198,

218, 222, 242,
246,

PC131, φC11,
SL5,

D3112**, Jb19,
F7,

PM69, PM13,

PM61, PM113,

φ240, 249 & 269



Pseudomonas


297,309,318, 




Arch.Virol.l993.131:141-151 


aeruginosa 


n, 






(Cont'd) 




















Pseudomonas cepacia 






Felix d'Herelle Reference
Centre, Quebec, Quebec


Pseudomonas fragi


wy 


27362-B1
27363-B1


The American Type Culture Collection 


Pseudomonas 
phaseolicola 






Felix d'Herelle Reference
Centre, Quebec, Quebec


Pseudomonas putida 


gh-1


12633-B1 


The American Type Culture Collection 


Pseudomonas syringae 




40492-B1 
21781-B1 


The American Type Culture Collection 


Pseudomonas sp. 


PPs-G3 


49780-B1 


The American Type Culture Collection 


Salmonella bareilly 


Sab 2 




Felix d'Herelle Reference 
Centre, Quebec, Quebec


Salmonella enteritidis 


1, 2, 3 & 6




Epidemiol. Infect. 1995. 114: 227-236


2a, 3a, 4a, 5a, 6a, 
7a, 8a, 9a, 15, 
19, 20&21** 




Vet. Med. Nauki 1975. 12: 55-60


Salmonella newington 


Epsilon 34 




J. Struct. Biol. 1995. 115: 283-289


Salmonella newport 


16-19 


27869-B1 
27869-B2 

- 


The American Type Culture Collection 






* 


Felix d'Herelle Reference 
Centre.Quebec.Quebec 


Salmonella paratyphi 


Paratyphoid A 


19940-B1 
12176-B1 


The American Type Culture Collection 




Jersey 




Felix d'Herelle Reference
Centre, Quebec, Quebec




Salmonella 

vpvt (ton hero 


oaSLI, oaLZ, oal 

3 

SaL4, SaL5 & 
SasL6 




Indian J. Med. Res. 1997. 105: 47-52


Salmonella 
typhimurium 


P22** 
SL-1 


19585-B1 
40282 


The American Type Culture Collection 


MB78** 




J. Virol. 1982. 41: 1038-1043




SE1 




J. Gen. Microbiol. 1986. 132: 1035-1041




LT2 




Virology 1971.45:835-636 




ES18** 




Virology 1970.42:621-632 




L** 




J.Virol.l985.56:1034-1036 



P1CM clr-100




Mol. Gen. Genet. 1975. 138: 113-126




F22 




GenetRes. 1986.48: 139-143 




Fels1




J.Gen. Virol. 1978.38:263-272 




Fels2 




Genet.Res. 1986.48: 139- 143 




Px 




Mol.Gen.Genet. 1 970. 108: 1 84-202 




Pike 




Virology 1974.60:503-514 




A3&A4 




J.Bacteriol. 1987.169:1003-1009 




HT 




Genet.Res. 1 976.27:3 1 5-322 


Salmonella 


IRA 




J.Basic Microbiol. 1990.30:707-716 


typhimurium 


Mudl 




Mol.Gen.Genet. 1986.202:327-330 


(Cont'd) 


P22 (cir4-l,cir5- 




Mol.Gen.Genet. 1984. 198: 105-109 




l&cir6-l) 








BF23** 




Mol. Gen. Genet. 1976. 147: 195-202




Kbl 




J.Bacteriol.l974.117:907-908 




P221dis 




J.Gen. Virol. 1 978.4 1 :367-376 




PRD1** 




Virology 1990.177:445-451 




lr2** 




J.Gen.Microbiol.l982.128:2797-2804 




tf-1 




J.Gen.Microbiol. 1987.133:953-960 




X** 




J. Gen. Microbiol. 1981. 126: 389-396


Salmonella 


g 


19937-B1 


The American Type Culture Collection 


typhosa/typhi




1Q93J4-B1 
i yy jo~o i 








19939-B1 






46 


19942-B1






53 


19943-B1 






163 

1 vJ 


19946-B1 






1 /j 


19947-B1






Vil 

vu 


27870-B1 






V1V1 


27870-B2 










Felix d'Herelle Reference








Centre, Quebec, Quebec




vin 




Chung Hua Liu Hsing Ping 








H.T.C.1992. 13:288 




\2 




J. Gen. Microbiol. 1983. 129: 3395-3400


Salmonella sp. 


P3 


25957-B1 


The American Type Culture Collection 




P4** 


25957-B2 






P9a 


25957-B3 






P9c 


25957-B4 






P10 


25957-B5 






102 


19945-B1 






Chi(x) 


9842-B1 






R34 


97541 






MG40 




Virology 1968.34:521-530 




P14 




Microb.Pathog. 1990.8:393-402 




PSP3 




Virology 1992. 188: 414




Ike** 




Zentralbl.Bakteriol. 1 976.234:294-304 




P27&9NA 




J.Virol. 1986. 12:92 1-931 


Sphaerotilus natans 


SN1 




Appl. Environ. Microbiol. 1979. 37: 1025-1030



Shigella dysenteriae




23351-B1 


The American Type Culture Collection 




P2 


11456b






rf-80 


11456a-B1




Shigella flexneri


D20 


12661-B1


The American Type Culture Collection 


Sfll** 




Mol.Microbiol. 1997.26:939-950 




SfV** 




Gene 1997.22:217-227 




Sf6** 




Mol.Microbiol. 1995. 18:201-208 




SfX 




Gene 1993.129:99-101 


Shigella sonnei








Ufa 




Mol. Biol. (Mosk) 1977. 11: 323-331


Shigella sp 


37 


23354-B1 


The American Type Culture Collection 


Spiroplasma citri 


SpV1




Plasmid 1993.29:193-205 


Spiroplasma sp. 


SpV1-R8A2B




Nucleic Acids Res. 1990.18:1293 


SpV3 




Isr. J. Med. Sci. 1987. 23: 429-433




SpV4




J. Bacteriol. 1987. 169: 4950-4961


Staphylococcus albus 






Staphylococci & Staphylococcal 






Infections. 1997. 








Vol. 1: 503-508 (Karger, Basel)











Staphylococcus aureus





27702-B1 




27703-B1 




27704-B1 




23360-B1 




23361-B1 


15 


27705-B1


17 


27712-B1 


29 


27690-B1 


42D**


27691-B1 


42E


27692-B1 


47 


27693-B1 


52 


27694-B1 


52A 


27695-B1 


53 


27696-B1 


54 


27697-B1


55 


27698-B1 


71 


27699-B1 


75 


27693-B2 


77 


27700-B1


Table 2



bacteriophage 77, complete genome sequence, 41708 nucleotides 

1 gatcaaaata cttggggaac ggttagggag taaacttcgc gataatttta aaaattcatg 

61 tataaccccc ctcttataac cattttaagg caggtgatga aatggagatt atagtcgatg 

121 aaaatttagt gcttaaagaa aaagaaaggc tacaagtatt atataaagac atacctagca 

181 ataaattaaa agtagttgat ggtttaatta ttcaagcagc aaggctacgt gtaatgcttg 

241 attacatgtg ggaagacata aaagaaaaag gtgattatga tttatttact caatctgaaa 

301 aggcgccacc atatgaaagg gaaagaccag tagccaaact atttaatgct agagatgctg 

361 catatcaaaa aataatcaaa caattatcgg atttattgcc cgaagagaaa gaagacacag 

421 aaacgccatc tgatgattac ctatgattag taataaatac gttgatgaat atataaattt 

481 gtggaaacaa ggaaagataa ttttaaataa agaaagaatt gatctcttta attatctaca 

541 aaaacatata tattcacgag atgatgtata ttttgatgaa cagaaaatcg aggattgtat 

601 caaatttatt gaaaaatggt attttccaac attaccattt caaaggttta tcatagctaa 

661 tatatttctt atagataaaa atacagatga agctttcttt acagaatttg ctattttcat 

721 gggacgtgga ggcgggaaaa acggtctaat aagtgctatt agtgattttc tttctacgcc 

781 cttacacgga gttaaagaat atcacatctc cattgttgct aatagtgaag atcaagcaaa 

841 aacatcgttt gatgaaatca gaaccgtttt aatggataac aaacgaaata agacgggtaa 

901 aacgccaaaa gctccttatg aagttagtaa agcaaaaata ataaaccgtg caactaaatc 

961 ggttattcga tataacacat caaacacaaa aaccaaagac ggtggacgtg aggggtgtgt 

1021 tatttttgat gaaattcatt atttctttgg tcctgaaatg gtaaacgtca aacgtggtgg 

1081 attaggtaaa aagaaaaata gaagaacgtt ttatataagt actgatggtt ttgttagaga 

1141 gggttatatc gatgcaatga agcacaaaat tgcaagtgta ttaagtggca aggttaaaaa 

1201 tagtagattg tttgcttttt attgtaagtt agacgatcca aaagaagttg atgacagaca 

1261 gacgtgggaa aaggcgaacc caatgttaca taaaccgtta tcagaatacg ctaaaacact 

1321 gctaagcacg attgaagaag aatataacga tttaccattc aaccgttcaa ataagcccga 

1381 attcatgact aagcgaatga atttgcctga agttgacctt gaaaaagtaa tagcaccatg 

1441 gaaagaaata ctagcgacta atagagagat accaaattta gataatcaaa tgtgtattgg 

1501 tggtttagac tttgcaaaca ttcgagattt tgcaagtgta gggctattat tccgaaaaaa 

1561 cgatgattac atttggttag gacattcgtt tgtaagacaa gggtttttgg atgatgtcaa 

1621 attagaacct cctattaaag aatgggaaaa aatgggatta ttgaccattg tcgatgatga 

1681 tgtcattgaa attgaatata tagttgattg gtttttaaag gctagagaaa aatatgggct 

1741 tgaaaaagtc atagctgata attatagaac tgatattgta agacgtgcgt ttgaggatgc 

1801 tggcataaaa cttgaagtac ttagaaatcc aaaagcaata catggattac ttgcaccacg 

1861 tatcgataca atgtttgcga aacataacgt aatatatgga gacaatcctt tgatgcgttg 

1921 gtttactaat aatgttgctg taaaaatcaa gccggatgga aataaagagt atatcaaaaa 

1981 agatgaagtc agacgtaaaa cggatggatt catggctttt gttcacgcat tatatagagc 

2041 agacgatata gtagacaaag acatgtctaa agcgcttgat gcattaatga gtatagattt 

2101 ctaatagagg aggtgagaca tgagtattct agaaaagata tttaaaacta ggaaagatat 

2161 aacatatatg cttgatttag atatgataga agatctatca caacaagcgt atgtgaaacg 

2221 tttagcgatt gatagttgta ttgaatttgt tgcgcgagct gtcgctcaaa gtcattttaa 

2281 agtattggaa ggtaatagaa ttcaaaagaa tgatgtttac tacaagttaa atataaaacc 

2341 aaatactgac ttatcaagcg atagtttttg gcaacaagtt atatataaac taatttatga 

2401 taacgaggtt ttaatcgtag taagtgacag caaagaatta cttatcgcag atagctttta 

2461 cagagaagag tacgctttgt atgatgatat attcaaagat gtaacggtta aagattatac 

2521 ttatcaacgt actttcacaa tgcaagaggt catatattta aagtacaaca acaataaagt 

2581 gacacacttt gtagaaagtc tattcgaaga ttacgggaaa atattcggaa gaatgatagg 

2641 tgcacaatta aaaaactatc aaataagagg gattttgaaa tctgcctcta gcgcatatga 

2701 cgaaaagaat atagaaaaat tacaagcgtt cacaaataaa ttattcaata cttttaataa 

2761 aaatcaacta gcaatcgcgc ctttgataga aggttttgat tatgaggaat tatctaatgg 

2821 tggtaagaat agtaacatgc ctttttctga attgagtgag ctaatgagag atgcaataaa 

2881 aaatgttgcg ttgatgattg gtatacctcc aggtttgatt tacggagaaa cagctgattt 

2941 ggaaaaaaac acgcttgtat ttgagaagtt ctgtttaaca cctttattaa aaaagattca 

3001 gaacgaatta aacgcgaaac tcataacaca aagcatgtat ttgaaagata caagaataga 

3061 aattgtcggt gtgaataaaa aagacccact tcaatatgct gaagcaattg acaaacttgt 

3121 aagttctggt tcatttacaa ggaatgaggt gcggattatg ttaggtgaag aaccatcaga 

3181 caatcctgaa ttagacgaat acctgattac taaaaactac gaaaaagcta acagtggtga 

3241 aaatgatgaa aaagaaaaag atgaaaacac tttgaaaggt ggtgatgaag atgaaagcgg

3301 agattaaagg cgtcatcgtt tccaacgaag ataaatgggt ttacgaaatg cttggtatgg 

3361 attcgacttg tcctaaagat gttttaacac aactagaatt tagtgatgaa gatgttgata 

3421 ttataattaa ctcaaatggt ggtaacctag tagctggtag tgaaatatat acacatttaa 

3481 gagctcataa aggcaaagtg aatgttcgta tcacagcaat agcagcaagt gcggcatcgc 

3541 ttatcgcaat ggctggtgac cacatcgaaa tgagtccggt tgctagaatg atgattcaca 
3601 atccttcaag tattgcgcaa ggagaagtga aagacctaaa tcatgctgca gaaacatcag

3661 aacatgttgg tcaaataatg gctgaggcat acgcggttag agctggtaaa aacaaacaag

3721 aacctacaga aacgatggct aaggaaacgt ggctaaatgc tgatgaagcc attgaacaag

3781 gtcttgcgga tagtaaaatg tttgaaaacg acaatatgca aattgcagca agcgatacac

3841 aagtgttatc gaaagatgta ttaaatcgtg taacagcttt ggtaagtaaa acgccagagg

3901 ttaacattga tattgacgca atagcaaata aagcaattga aaaaataaat atgaaagaaa

3961 aggaatcaga aatcgatgtt gcagatagta aattatcagc aaatggattt tcaagattcc

4021 ttttttaata caaaaatagg aggtcacaaa acgactacaa acttatcgga aacattcgca

4081 aacgcgaaaa acgaatttat taatgcagca aacaacggtg aaccgcaaga aagacaaaat

4141 gaactgtacg gtgacatgat taaccaacca cttgaagaaa ctaaattaca agcaaaagca

4201 gaagccgaaa gagtttctag tttacctaaa ccagcacaaa ctttgagtgc aaaccaaaga

4261 aatttcttta tggatatcaa caagagtgtt ggatataaag aagaaaaact tttaccagaa

4321 gaaacaattg atagaatctt cgaagatcca acaacgaatc atccattatt agctgactta

4381 ggtactaaaa atgctggttt gcgtttgaag ttcttaaaat ccgaaacttc tggcgtggct

4441 gtttggggta aaatctatgg tgaaattaaa ggtcaattag acgctgcgtt cagtgaagaa

4501 acagcaattc aaaataaatt gacagcgttc gctgtcctac caaaagattt aaatgatttt

4561 ggtcctgcgc ggattgaaag atttgctcgt gctcaaaccg aagaagcatt tgcagtggcg

4621 cttgaaactg cgttcttaaa aggtactggt aaagaccaac cgattggctt aaaccgtcaa

4681 gtacaaaaag gcgtatcggt aaccgatggt gcttatccag agaaagaaga acaaggtacg

4741 cttacatttg ctaatccgcg cgctacggtt aatgaattga cgcaagtgtt taaataccac

4801 tcaactaacg agaaaggtaa atcagtagcg gttaaaggta atgtaacaat ggttgttaat

4861 ccgtccgatg ctttcgaggt tcaagcacag tatacacatt taaatgcaaa tggcgtatat

4921 gttactgctt taccatttaa tttgaatgtt attgagtcta cagttcaaga agcaggtaag

4981 gttttaacgt acgttaaagg tctatacgac ggttatccag ccggtggcac taatgtccag

5041 aaatttaaag aaacacttgc gttagatgat atggatctat acactgcaaa acaatttgct

5101 tacggcaaag cgaaagataa taaagttgct gctgtttgga aattagattt aaaaggacat

5161 aaaccagctt tagaagatac cgaagaaaca ctataaaatt ttatgaggtg ataaaatggt

5221 gaaatttaaa gttgttagag aatttaaaga catagagcac aatcaacaca agtacaaagt

5281 aggggagttg tatccagctg aagggtataa caatcctcgt gttgaattgt tgacaaatca

5341 aaccaaaaat aagtacgaca aagtttatat cgtaccttta gataagctga caaaacaaga

5401 attattagaa ctatgcgaat cattacaaaa aaaagcgtct agttcaatgg ttaaaagtga

5461 aatcatcgac ttattgaatg gtgaagacaa tgacgattga tgatttgctt gtcaaattta

5521 aatcacttga aaagattgac cataattcag aggatgagta cttaaagcag ttgttaaaaa

5581 tgtcgtacga gcgtataaaa aatcagtgcg gagtttttga attagagaat ttaataggtc

5641 aagaactgat acttatacgc gctagatatg cttatcaaga tttattagaa cacttcaacg

5701 acaattacag acctgaaata atagattttt cgttacctct aatggaggta tcagaagatg

5761 aagaaagtgt ttaagaaacc tagaattaca actaaacgtt taaatacgcg tgttcatttt

5821 tataagcata ctgaaaataa tggtccagaa gctggagaaa aagaagaaaa attattatat

5881 agctgttggg cgagtattga tggtgtctgg ttacgtgaat tagaacaagc tatctcaaac

5941 ggaacgcaaa atgacattaa actgcatatt cgtgacccgc aaggtgatta tttacccagt

6001 gaagaacafct atctcgaaat tgaaccaaga tatttcaaaa atcgtttgaa tataaagcaa

6061 gtatcaccag atttggataa taaagacttt attatgattc gcggaggata tagttcatga

6121 gtgtgaaagt gacaggtgat aaagcattag aaagagaatt agaaaaacat tttggcataa

6181 aagagatggt aaaagtccaa gataaggcgt taatagctgg tgctaaggta attgttgaag

6241 aaataaaaaa acaactcaaa ccttcagaag actcaggagc actgattagt gagattggtc

6301 gtactgaacc tgaatggata aaggggaaac gtactgttac aattaggtgg cgtgggcctt

6361 ttgaacgatt tagaatagta catttaatcg aaaatggtca tgttgagaaa aagtcaggaa

6421 aacccgcaaa acccaaagcc acgggcggga ttaatagagc aataagacaa gggcaaaaca

6481 agtattttga gacgctaaaa agggagttga aaaaattgtg attgatattt tgtacaaagt

6541 tcatgaagtg attagtcaag acagaattat tagagagcac gtaaatatca ataatattaa

6601 gttcaacaaa taccctaatg taaaagatac tgatgtacct tctattgtta ttgacgatat

6661 cgacgaccca atacctacaa cttatactga cggagatgag tgtgcatata gttatattgt

6721 ccaaatagat gtttttgtta agtacaatga tgaatataat gcgagaatca taagaaataa

6781 gatatctaat cgcattcaaa agttattatg gtctgaacta aaaatgggaa atgtttcaaa

6841 tggaaaaccg gaatatatag aagaatttaa aacatacaga agctctcgcg tttacgaggg

6901 catcctttat aaggaggaaa attaaatggc agcaaaacat gcaagtgcgc caaaggcgta

6961 cattaacatt actggtttag gtttcgctaa attaacgaaa gaaggcgcgg aattaaaata

7021 tagtgatatt acaaaaacaa gaggattaca aaaaatcggt gttgaaactg gtggagaact

7081 aaaaacagct tatgctgatg gcggtccaat cgaatcaggg aatacagacg gagaaggtaa

7141 aatctcatta caaatgcatg cgttccctaa agagattcgc aaaattgttt ttaatgaaga

7201 ttatgatgaa gatggcgttt acgaagagaa acaaggtaaa caaaacaatt acgtagctgt

7261 atggttcaga caagagcgta aagacggtac atttagaaca gttttattac ctaaagttat

7321 gtttacaaat cctaaaatcg atggagaaac ggctgagaaa gattgggatt tctcaagtga

7381 agaggttgaa ggcgaggcac ttttcccttt agttgataat aaaaagtcag tacgtaagta

7441 tatctttgat tcagctaaca tgacaaatca tgatggagac ggtgaaaaag gcgaagaggc

7501 tttcttaaag aaaattttag gcgaagaata tactggaaac gtgacagagg gtaacgaaga

7561 aactttgtaa caaaaccggc cccatcggaa accgcggtaa agtcggttaa tataccagat

7621 agcattaaaa cacttaaagt tggcgacaca tacgatttaa atgttgtagt agagccatct
7681 aaccaaagta agttattgaa atacacaaca gatcaaacga atattgtatc aaccaatagt

7741 gatggtcaag ctactgcgga agcacaaggc actgctacgg ttaaagcaac agctggcaat

7801 atgagtgaca ctacaacaat aaacgtagaa gcataagagg gggcaacccc tctattttat 

7861 ttgaaaataa ggagagtatt ataaaacggc aaaactaaaa cgtaacatta ttcaattagt 

7921 agaagatcca aaagcaaatg aaactaaatt acaaacgcac ttaacaccac acttcatttc 

7981 atccgaaatc gcatacgaag caatggattt aatcgacgat attgaggacg aaaatagcac 

8041 gatgaagcca agagaaatcg ccgacagatt gatggatatg gctgtaaaaa tttacgataa 

8101 ccaattcaca gttaaagacc taaaagaacg tatgcatgca cctgatggaa cgaatgcact 

8161 tcgcgaacaa gtgattttca ttactcaagg tcaacaaact gaggaaacta gaaattttat 

8221 ccagaacatg aaataaagcc tgaagatcta acatataaag caacgttgaa aaatatggac 

8281 actctcatga tggacttaat tgaaaacggt aaagacgcta acgaagtttt aaaaatgcca 

8341 ctccattatg tgccttccat atatcaaaat aaaaataatg acatttctga agaaaaagca 

8401 gaggctttaa ttgatgcatt ctaaccttaa ccgttcggtt agggtcattt ttctgaactt

8461 ttttagaaag gaggcaaaaa atgggagaaa gaacaaaagg tttatctata ggtttggatt

8521 tagatgcagc aaacttaaat agaccatttg cagaaatcaa acgaaacttt aaaactttaa 

8581 attctgactt aaaattaaca ggcaacaact tcaaatatac cgaaaaatca actgatagtt 

8641 acaaacaaag gattaaagaa cttgatggaa ctatcacagg ttataagaaa aacgttgatg 

8701 atttagccaa gcaatatgac aaggtatctc aagaacaggg cgaaaacagt gcagaagctc 

8761 aaaagttacg acaagaatat aacaaacaag caaatgagct gaattattta gaaagagaac 

8821 tacaaaaaac atcagccgaa cttgaagagt ccaaaaaagc tcaagttgaa gctcaaagaa 

8881 cggcagaaag tggctgggga aaaaccagta aagtttttga aagcatggga cctaaattaa 

8941 caaaaatggg tgatggttta aaatccattg gtaaaggttt gatgattggt gtaactgcac 

9001 ctgcttcagg cattgcagca gcaccaggaa aagcttttgc agaagttgat aaaggtttag 

9061 atactgttac tcaagcaaca ggcgcaacag gcagtgaact aaaaaaattg cagaactcat 

9121 ctaaagatgt ttatggcaat tttccagcag atgctgaaac tgttggtgga gcttcaggag

9181 aagttaatac aaggttaggt cttacaggta aagaacttga aaatgccaca gagtcattct 

9241 tgaaattcag tcatataaca ggttccgacg gtgtgcaagc cgcacagtta attacccgtg 

9301 caatgggcga tgcaggtatc gaagcaagtg aatatcaaag cgttttggat atggtagcaa 

9361 aagcggcgca agctagtggg ataagtgttg atacattagc tgatagtatt actaaatacg 

9421 gcgctccaat gagagctatg ggctttgaga tgaaagaatc aattgcttta ttctctcaat 

9481 gggaaaagtc aggcgttaat actgaaatag cactcagtgg tttgaaaaaa gctatatcaa 

9541 attggggtaa agctggtaaa aacccaagag aagaatttaa gaagacatta gcagaaattg 

9601 aaaagacgcc ggatatagct agcgcaacaa gtttagcgat tgaagcattt ggtgcaaagg 

9661 caggtcctga tttagcagac gctattaaag gtggtcgctt tagttatcaa gaatttttaa 

9721 aaactattga agattcccaa ggcacagtaa accaaacatt taaagattct gaaagtggct 

9781 ccgaaagatt taaagtagca atgaataaat taaaattagt aggtgctgat gtatgggctt 

9841 ctattgaaag tgcgtttgct cccgtaatgg aagaatcaat caaaaagcta tctatagcgg 

9901 ttgattggtt ttccaattta agtgatggtt ctaaaagatc aattgttatt ttcagtggca 

9961 ttgctgctgc aactggtcct gtagtttttg ggttaggtgc atttataagt acaatcggca 

10021 atgcagtaac tgtattagct ccactgttag ccagtattgc aaaggctggt ggattgatta 

10081 gttttttatc gactaaagta cctatattag gaactgtctt cacagcctta actggtccaa 

10141 ttggcattgt actaggtgta ttggctggtt cagcagtcgc atttacaatt gcttataaga 

10201 aatctgaaac atctagaaat tttgttaacg gtgcaattga aagtgttaaa caaacattta 

10261 gtaattttat tcaatttatt caacctttcg ttgattctgt taaaaacatc tttaaacaag 

10321 cgatatcagc aatagttgat tccgcaaaag atatttggag tcaaatcaat ggattcttta 

10381 atgaaaacgg aatttccatt gttcaagcac ctcaaaatat atgcaacttt attaaagcga 

10441 tatttgaatt tattctaaat tttgtaatta aaccaactat gttcgcgatt tggcaagtga 

10501 tgcaatttat ttggccggcg gttaaagcct tgattgtcag tacttgggag aacataaaag 

10561 gcgtaataca aggtgcttta aatatcatac ttggcttgat taagttcttc tcaagtctat 

10621 tcgttggtga ttggcgagga gtctgggacg ccgtcgtgat gattcttaaa ggagcagttc 

10681 aattaatttg gaatttagtt caattatggt ttgtaggtaa aatacttggt gttgttaggt 

10741 actttggcgg gttgctaaaa ggattgatag caggaatttg ggacgtaata agaagcatat 

10801 tcagtaaatc tttatcagca acttggaatg caacaaaaag tatcttcgga tttttattta 

10861 atagcgcaaa atcaattttc acaaatatga aaaattggtt atctaatact tggagcagta 

10921 tccgcacgaa tacaatagga aaagcgcagt catcatttag cggcgtcaaa ccaaaattta 

10981 ctaatttatg gaatgcgacg aaagaaactt ttagcaatct aagaaattgg atgtcaaata

11041 tttggaattc cattaaagat aatacggtag gaattgcaag ccgtttatgg agtaaggtac 

11101 gtggaatttt cacaaatatg cgcgatggct tgagttccat tatagataag attaaaagtc 

11161 ataccggcgg tatggtaagc gctattaaaa aaggacttaa caaattaatc gacggtctaa

11221 actgggtcgg tggtaagttg ggaatggata aaatacccaa gttacacact ggtacagagc 

11281 acacacatac taccacaaga tcagttaaga acggtaagat tgcacgtgac acattcgcta 

11341 cagttgggga taagggacgc ggaaatggtc caaatggtcc tagaaatgaa atgaccgaat 

11401 tccctaacgg taaacgtgta atcacaccca atacagatac taccgcttat ttacctaaag 

11461 gctcaaaagt atacaacggt gcacaaactt attcaatgtt aaacggaacg cttccaagat 

11521 ttagtttagg cactatgtgg aaagatatta aatctggtgc atcatcggca ctcaactgga

11581 caaaagataa aataggtaaa ggcaccaaac ggctcggcga taaagttggc gatgttttag 

11641 attttatgga aaatccaggc aaactttcaa atcatatact tgaagcttct ggaattgatc 

11701 tcaattcttt aactaaaggt atgggaactg caggcgacat aacaaaagcc gcacggtcta 
11761 agattaagaa aagtgctact gattggataa aagaaaattt agaagctatg ggcggtggcg

11821 attcagtcgg cggaacacta gaccccgaca aaactaatca tcatcatgga cgtaccgcag 

11881 cttataccgc cgcaactgga agaccacctc atgaaggtgt cgantttcca tttgtatatc 

11941 aagaagttag aacgccgacg ggtggcagac ttacaagaat gccatttatg cctggcggtc 

12001 acggtaatta tgtaaaaatc actagcggcg tcatcgatat gctatttgcg catctgaaaa 

12061 actttagcaa atcaccacct agcggcacga tggtaaagcc cggcgatgtt gtcggtctaa

12121 ctggtaacac cggatctagt acaggaccac acttacattt tgaaatgagg agaaatggac 

12181 gacactctga ccctgaacca tatttaagga atgctaagaa aaaaggaaga ccatcaacag

12241 gtggcggcgg tgctacttct ggaagtggcg caacttatgc cagtcgagta atccgacaag

12301 cgcaaagtat tttaggtggt cgttataaag gtaaatggat tcatgaccaa atgatgcgcg 

12361 ttgcaaaacg cgaaagtaac taccagtcaa atgcagtgaa taaccgggat ataaacgctc

12421 aaagaggaga cccatcaaga ggatcattcc aaatcaccgg ctcaactttt agagcaaacg 

12481 ctaaacgtgg atatactaac tttaataatc cagcacatca aggcatctca gcaacgcagt 

12541 acattgttag acgatatggt cggggtggtc ttaaacgtgc tggtgattac gcacatgcta 

12601 caggtggaaa agtttttgat ggttggcata acttaggtga agacggtcat ccagaatgga 

12661 ttattccaac agatccagcc cgcagaaatg acgcaatgaa gatcttgcat tatgcagcag

12721 cagaagtaag agggaaaaaa gcgagtaaaa acaagcgtcc tagccaatta tcagacctaa 

12781 acgggtttga tgatcctagc ttattactga aaatgattga acaacagcaa caacaaatag 

12841 ctttactact gaaaacagca caatctaacg atgtgatcgc agataaagat tatcagccga 

12901 ttattgacga atacgccttt gataaaaagg tgaacgcgtc tatagaaaag cgagaaaggc 

12961 aagaatcaac aaaagtaaag tttagaaaag gaggaattgc tactcaatga cagacactat 

13021 taaagcgaac aacaaaacaa ttccttggtt gtatgtcgaa agagggcttg aaataccccc

13081 ttttaattat gttttaaaaa cagaaaatgt agatggacgt tcggggtcta tatataaagg 

13141 gcgtaggctt gaatcttata gttttgatat acctttggtg gtacgcaatg accatccatc

13201 tcacaacggc attaaaacac atgatgacgt cttgaatgaa ttagtaaagt tttctaacca 

13261 cgaggaacaa gttaaattac aattcaaatc taaagattgg tactggaacg cttatttcga 

13321 aggaccaata aagctgcaca aagaatttac aatacctgtt aagttcacta ccaaagcagt 

13381 actaacagac ccttacaaat attcagtaac aggaaataaa aatactgcga tttcagacca 

13441 agtttcagtt gtaaatagtg ggactgctga cactccttta actgttgaag cccgagcaat 

13501 taaaccatct agttacttta tgactactaa aaatgatgaa gattatttta tggttggtga 

13561 tgatgaggta accaaagaag ttaaggatca catgcctcct gtttatcata gtgagtttcg 

13621 tgatttcaaa ggttggacta agatgactac tgaagacacc ccaagcaatg acttaggtgg

13681 taaggtcggc ggtgactttg tgatatccaa tcttggcgaa ggatataaag caactaattt 

13741 tcctgatgca aaaggttggg tcggtgctgg cacgaaacga gggctcccta aagcgatgac 

13801 agatctccaa accacctaca aatgtattgt tgaacaaaaa ggtaaaggtg ccggaagaac

13861 agcacaacat atttatgata gtgatggcaa gtcacttgct tctattggtt atgaaaataa 

13921 atatcatgat agaaaaatag gacatattgt tgttacgttg tataaccaaa aaggagaccc 

13981 caaaaagata tacgactatc agaataaacc gataatgtat aacttggaca gaatcgttgt 

14041 ttatatgcgg ctcagaagag taggtaataa attttctatt aaaacttgga aatttgatca 

14101 cattaaagac ccagatagac gtaaacctat tgatatggat gagaaagagt ggacagatgg 

14161 cggtaagttt tatcagcgtc cagcttctat cacagctgtc tacagcgcga agtataacgg 

14221 ttataagcgg atggagatga atgggttagg ttcattcaac acggagactc taccgaaacc 

14281 gaaaggcgca agggatgtca ttatacaaaa aggtgattta gtaaaaatag atatgcaagc 

14341 aaaaagtgtt gtcatcaatg aggaaccaat gttgagcgag aaatcgtttg gaagtaatta 

14401 tttcaatgtt gattctgggt acagtgaatt aatcatacaa cctgaaaacg tctttgatac 

14461 gacggttaaa tggcaagata gatattcata gaaaggagat gagagtgtga tacatgtttt

14521 agattttaac gacaagatta tagatttcct ttctactgat gacccctcct tagttagagc 

14581 gattcataaa cgtaatgtta atgacaattc agaaacgctt gaactgccca taccatcaga 

14641 aagagctgaa aagttccgcg aacgacatcg cgttattaca agggattcaa acaaacaatg

14701 gcgtgaattt attattaact gggttcaaga cacgatggac ggccacacag agatagaatg 

14761 tatagcgtct catcttgctg acacaacaac agctaaaccg tatgcaccag gcaaatttga 

14821 gaaaaagaca acttcagaag cattgaaaga tgtgttgagc gatacaggtt gggaagtccc

14881 tgaacaaacc gaatacgatg gcttacgtac tacgtcatgg acttcctacc aaactagaca 

14941 tgaagtttta aagcaatcat gtacaaccta taaaatggtt ttagattttt atattgagct 

15001 tagctctaat accgtcaaag gtagatatgt agcactcaaa aagaaaaaca gcttattcaa 

15061 aggtaaagaa accgaacatg gtaaagattt agtcgggtca actaggaaga ttgacatgcc

15121 agaaatcaaa acagcattaa ttgccgtggg acctgaaaat gacaaaggga agcgtttaga 

15181 gccagttgtg acagacgacg aagcgcaaag tcaattcaac ctacccatgc gctatacttg

15241 ggggacatat gaaccacaat cagatgatca aaatatgaat gaaacacgat taagttcttt 

15301 agccaaaaca gagctaaata aacgcaagtc ggcagtcatg tcatatgaga ccacttctac 

15361 tgatttggaa gttacgtatc cgcacgagat tatatcaatt ggcgatacag tcagagtaaa 

15421 acatagagac tttaacccgc cattgtatgt agaggcagaa gttactgctg aagaatataa 

15481 cacaatttca gaaaacagca catacacatc cggtcaacct aaagagtcca aagaaccaga

15541 attacgagaa gagctcaaca agcgactgaa cataatacat caaaagctaa acgataatat 

15601 tagcaatatc aacactatag tcaaagacgt tgcagatggt gaattagaat accttgaacg 

15661 caaaatacac aaaagtgata caccgccaga aaacccagtc aatgatatgc tctggtatga 

15721 tacaagtaac cctgatgttg ctgtcttgcg tagatattgg aatggtcgat ggattgaagc 

15781 aacaccaaat gatgtcgaaa aattaggtgg tataacaaga gagaaagcgc tatccagcga 
15841 actaaacaat atttttatta atttatctat acaacacgct agtcttttgt cagaagctac

15901 agaattactg aatagcgagt actcagcaga caacgatctg aaagcggacc cacaagcaag

15961 tttagacgct gtgatcgatg tttacaatca aattaaaaat aatttagaac ctatgacacc 

16021 cgaaactgca acgatcggtc ggtcggtaga tacacaagct ttatttcttg agtacagaaa 

16081 gaaattacaa gatgtctata cagatgcaga agatgtcaaa atcgccattt cagatagatt

16141 taaaccatta cagtcacaat acactgatga aaaatacaaa gaagcgttgg aaataacagc

16201 aacaaaattt ggtttaacgg tgaatgaaga tttgcagtta gtcggagaac ctaatgttgt 

16261 taaatcagct attgaagcag ctagagaatc cacaaaagaa caattacgtg actatgtaaa 

16321 aacatcggac tataaaacag acaaagacgg tattgttgaa cgttcagata ctgctgaagc 

16381 tgagagaacg actttaaaag gcgaaatcaa agataaagtc acgttaaacg aatatcgaaa

16441 cggatcggaa gaacaaaaac aatatactga tgaccagtta agtgatttgt ccaataaccc 

16501 tgagattaaa gcaagtactg aacaagcaaa ccaagaagcg caagaagctt taaaatcata 

16561 caccgatgcc caagatgatc ttaaagagaa ggaatcgcaa gcgtatgctg atggtaaaat 

16621 ttcggaagaa gagcaacgcg ctatacaaga tgctcaagct aaacttgaag aggcaaaaca 

16681 aaacgcagaa ccaaaggcta gaaacgctga aaagaaagct aatgcttata cagacaacaa 

16741 ggtcaaagaa agcacagatg cacagaggaa aacattgact cgctatggtt ctcaaattat 

16801 acaaaatggt aaggaaacca aattaagaac taccaaagaa gagtttaatg caaccaatcg 

16861 tacacttcca aatatattaa acgagattgt tcaaaatgtt acagatggaa caacaatcag 

16921 atatgatgat aacggagtgg ctcaagcttt gaatgtgggg ccacgtggta tcagactaaa 

16981 tgctgataaa actgatatta acggtaatag agaaataaac cttcttatcc aaaatatgcg 

17041 agataaagta gacaaaaccg atattgtcaa cagtcttaat ttatcaagag agggtcttga 

17101 tatcaatgtt aatagaattg gaattaaagg cggtgacaat aacagatatg ttcaaataca 

17161 gaatgactct attgaactag gtggtattgt gcaacgtact tggagaggga aacgttcaac 

17221 agacgacacc tttacgcgac tgaaagacgg tcacctaaga tttagaaaca acaccgctgg

17281 cggtccacct catatgtcac attttggtat ttcgactcac attgatggtg aaggcgaaga 

17341 cggtggttca tctggtacga ttcaatggtg ggataaaact tacagtgata gcggcatgaa 

17401 tggtataaca atcaatccct atggtggtgt cgttgcacta acgtcagata ataatcgggt 

17461 tgttctggag tcttacgctt catcgaatat caaaagcaaa caggcaccgg tgtatttata 

17521 tccaaacaca gacaaagtgc ctggattaaa ccgatttgca ttcacgctgt ctaatgcaga 

17581 taatgcttat tcgagtgacg gttatattat gtttggttct gatgagaact atgattacgg

17641 tgcgggtatc aggttttcta aagaaagaaa taaaggtctt gttcaaattg ttaatggacg 

17701 atatgcaaca ggtggagata caacaatcga agcagggtat ggcaaattta atatgctgaa 

17761 acgacgtgat ggtaataggt atattcatat acagagtaca gacctactgt ctgtaggttc 

17821 agatgatgca ggagatagga tagcctccaa ctcaatttat agacgtactc attcggccgc 

17881 agctaatttg catattactt ctgctggcac aattgggcgt tcgacaccag cgcgtaaata 

17941 caagttatct atcgaaaatc aatataacga cagagatgaa caactggaac attcaaaagc 

18001 tatccttaac ttacctatta gaacgcggtt cgataaagct gagtctgaaa ttttagctag 

18061 agagctgaga gaagatagaa aattatcgga agacacctat aaacttgata gatacgtagg 

18121 tttgattgct gaagaggtgg agaatttagg attaaaagag tttgtcacgc atgatgacaa 

18181 aggagaaact gaaggtatag cgcatgatcg tctatggatt catcttatcc ctgttaccaa 

18241 agaacaacaa ctaagaatca agaaactgga ggagtcaaag aatgcaggat aacaaacaag 

18301 gactacaagc taatcctgaa cacacaattc accattcacc acaggaaatt atgaggccaa 

18361 cacaagaaaa cgcgatgtta aaagcgtata tacaagaaaa taaagaaaat caacaatgtg 

18421 ctgaggaaga gtaatcctta gcactatttt tatacaaaaa ttcaaggagg tcatttaatt 

18481 acggcaaaag aaattatcaa caatacagaa aggtttatct tagtacaaat cgacaaagaa 

18541 ggtacagaac gtgtagtata ccaagatttc acaggaagtt ttacaacttc tgaaatggtt 

18601 aaccatgctc aagattttaa atctgaagaa aacgctaaga aaattgcgga gacgccaaat 

18661 ttgttatatc aattaactaa caaaaaacaa cgtgtgaaag tagtcaaaga agtagtcgaa 

18721 agaccagatt tatctccaga ggtaacagtt aacactgaaa cagtatgaaa agctatgagt 

18781 cagataccca cagtcttcat tcttttagaa agcgggtgta ctgaattggg gtggttcaaa

18841 aaacacgaac atgaatggcg catcagaagg ttagaagaga atgataaaac aatgctcagc 

18901 acactcaacg aaattaaatt aggtcaaaaa acccaagagc aagctaacat taaattagat 

18961 aaaaccttag atgctattca aaaagaaaga gaaatagatg aaaagaacaa gaaagaaaat 

19021 gataagaaca cacgtgatat gaaaatgtgg gtgcttggtt tagttgggac aatatttggg 

19081 tcgccaacta tagcattatt gcgtatgctt atgggcatac aagagaggcg actaccacgc 

19141 tcggattaaa ttttggagct tcgccgtgga cgtgtttctg gtttggtaag cgtaagtaat 

19201 agttaagagt cagtgcttcg gcactggctt tttattttgg ataaaaggag caaacaaatg 

19261 gatgcaaaag taataacaag acacaccgta ttgatcttag cactagtaaa tcaatcctta 

19321 gcgaacaaag gtattagccc aattccagta gacgatgaaa ctatatcatc aacaatactt 

19381 actgtagtcg ctttatatac aacgtataaa gacaatccaa catctcaaga aggtaaatgg 

19441 gcaaatcaaa aactaaagaa atataaagct gaaaataagt atagaaaagc aacagggcaa 

19501 gcgccaatta aagaagtaat gacacctacg aatatgaacg acacaaatga tttagggtag 

19561 gtggttgata tatgttaatg acaaaaaatc aagcagaaaa atggttcgac aacccattag

19621 ggaaacaatt caacccagat ggttggtatg gacttcagcg ttatgatcac gccaatatgt 

19681 tctttatgtt agcgacaggc gaaaggctgc aaggtttata cgcctataat atcccgtttg 

19741 acaacaaagc aaagaccgaa aaatatggtc aaacaactaa aaactatgac agcttcttac 

19801 cgcaaaagtt ggatattgtc gttttcccgt caaagtatgg tggcggagct ggacacgttg 

19861 aaatcgttga gagcgcaaat ttaaacactt tcacatcatt tggtcaaaac tggaacggta 
19921 aaggtcggac taatggcgtt gcgcaacctg gttggggtcc tgaaactgcg acaagacacg

19981 ctcattatta tgacaatcca atgtattcca ccaggttaaa cctccctaac aacttaagcg

20041 ttggcaataa agctaaaggc attattaagc aagcgactac aaaaaaagag gcagtaatta

20101 aacctaaaaa aactatgctt gtagccggtc atggttataa cgatcctgga gcagtaggaa

20161 acggaacaaa cgaacgcgat tttatacgta aatatataac gcctaatatc gctaagtatt

20221 taagacatgc aggacatgaa gttgcateafc acggtggccc aagccaatca caagatacgt

20281 atcaagatac tgcatacggc gccaacgtag gcaataaaaa agattatggc ttatatcggg

20341 ttaaatcaca ggggtatgac attgttctag aaatacattt agacgcagca ggagaaagcg

20401 caagtggtgg gcatgttatc atctcaagtc aattcaatgc agatactact gacaaaagca

20461 tacaagatgt cactaaaaat aacccaggac aaataagagg tgtgacacct cgtaatgatt

20521 tactaaatgt taatgtatca gcagaaacaa atataaatta tcgtttatct gaattaggtt

20581 tcattactaa taaaaatgat atggattgga ctaagaaaaa ctatgacttg tactctaaat

20641 taacagccgg tgcgattcat ggtaagccta taggcggttt ggtagctggt aacgttaaaa

20701 catxagctaa aaacaaaaaa aatccaccag tgccagcagg ttatacactc gataagaata

20761 acgtccctta taaaaaagaa caaggcaact acacagtagc taatgttaaa ggcaataatg

20821 taagagacgg ttattcaact aattcaagaa ttacaggggt actacccaac aacacaacaa

20881 ttacgtatga cggcgcatat cgcaccaatg gctacagatg gatcacttac attgctaata

20941 gtggacaacg tcgttatata gcgacaggag aggtagacaa ggcaggtaat agaataagta

21001 gttttggtaa gtttagcacg atttagtatt tacttagaat aaaaattttg ctacattaat

21061 tatagggaat cttacagtta ttaaataact attcggatgg atgttaatat tcctatacac

21121 tttttaacac ttctctcaag atttaaatgt agataacagg caggtacttc ggtacttgcc

21181 tattttttta tgttatagct agccttcggg ctagcttttt gttatgatgt gttacacatg

21241 catcaactat ttacatctat ccttgttcac ccaagcatgt cactggatgt tttttcttgc

21301 gatagagagc atagttttca tactactccc cgtagtatat atgactttag cattcccgta

21361 caacagttta cggggtgctt ttatgttata attgctttta tatagcagga gtgaactata

21421 tagccgggca gaggccatgt atctgactgt tggtcccaca ggagacatct tcctcgtcat

21481 cactcgatac atatatctta acaacataga aatgttacat tcgctataac cgtatcttaa

21541 tcgatacggc catattcacc cccccacaac caacaaaacc acagatccca ttaatttagg

21601 attgtggtta ttttttgcgt ttttttgggg caaaaaaagg gcagattatt tgaaaaaggg

21661 caaacgcttg tggaaaagct aaaaggctaa aaacgacaaa aaccttgata caacagtgtt

21721 tttggacgct cgtgtacgtt agagaatgac cggtttacca ccatacaagg gtgggattaa

21781 cccgtgttaa aaagccttta atatcagttg ttacaaagga tttgtagcgt ctttaaaaat

21841 aaaaaagggc agaaaaaggg cagatacctt ttagtacaca agttttccta atttttgctc

21901 taactctctg tccattttct ctgttacatg tgtatacacc tttatagtcg ttttttcacc

21961 tgtatgtcct actcttttca taattgcttt taacgatata ttcatctccg ccaacaaact

22021 tatgtgtgta tgccttagtg tgtgagtagt aactttttta tttatattta acgattctgc

22081 agctgaggac aatcgtttgt ttatcctact gccttgcata ggatttcctt ggcaagttgt

22141 gaatacaaac cctctatcaa catagcttgg ttcccactgt tgcatctttt tattttctaa

22201 cattattttt ttcaatacat tcgccatcct tgaatcgatg gcgatttttc tccttgaacc

22261 tgcggtctta gtagtatctt tgtgaccaaa cccagcatta catttgattc tgtgaatagt

22321 gccattaata gcgatcgttt tatttttgag gtcaacatct ttaacttgga gagctaataa

22381 ctcacctacg cgcatacctg ttaaagcttg aacttctaca gccccagcaa ctaaaatacg

22441 agctctatac tgcacgttat catcgttcag tataaaaccg cgtatctgta ttacctgttc

22501 catctctaaa tagttataca ttttcgctcc ctctttttct atatcttcta tcgtcttact

22561 cttctttggt agtgtgacgc tatttaatac gtgttcgttt ggataattgc aaaattcaac

22621 ggcgtattta atagcttctt tcatatgtcc aagttgacgc tttacctgat tcgcagaata

22681 tacgtttgat aatttgttaa taaatgtttg catgtacttt gtatcaattt tgtttaaaag

22741 taaattttga gaaccgctct tctcgatgtc tttgattcct gctttcaaat tatcaagcgt

22801 cgttacttta aagccagacg tttttatacg atattcaagc cattcatcta ataacgcgtg

22861 aaaagtcaaa gtttttaatt cgctcgacga ctcgtcgtct agtttttctt ccatttctcc

22921 ttctaaacga aacattgcct ccttttgcga ttgctttgta ttcttattca agacaacact

22981 cacacgtttc catttatctg tatacggatc tttgtatttc tcgtagtatc tatacttcgt

23041 ttcattgttc ttatttttaa atttttcaaa ccacatttta catccctcct caaaattggc

23101 aaaaaacaat aagggtaggc gggctaccca tgaaaatcgt acaaaaaaag acgcccgtac

23161 aaaatacaga cgccacttat aattataaga ttacatggtt aattaccaaa aatggtaacg

23221 aatatatacg tgttttaaag gacaaacctt taatatacta aaaccatatc atcttatatc

23281 agggatctgc aatatattat tattaattct atttatcagt aacacaatac ccgaagaatc

23341 tactactgga tttttaattt tctggggtaa aacttttctt atgcgaaact tactaatcgg

23401 ccggaaagaa tttatgcaag cgtaactatt accttttaac tcttttacct tatcaattgc

23461 tgatactatg ttattaatgt ttctgtcaat tttatctaat ttattttcaa tttctaaact

23521 atcagatata aattcaataa aacaatctct agcgacgaat cctgtgttgt ttttttggta

23581 ttttttatcg aaaacttctt ttaatatagc cgaactattc tgcgcgctaa ttaaatctaa

23641 aaacaatctt aaataacact cccacttcaa atcaaaattc atctttaaac actttttgtt

23701 ttctttagag gacaagggaa taacatttac tataccctcc gcactagaac catttttact

23761 catcactatt gcaaagtgtg aactagaaaa tcctttatta acgtttatac cgaaatctac

23821 aaaaactacc tctccttgtt caaactttgg acaaaaacct ttatggtttt tttcaccttc

23881 aaatctcttg agtaaacagt gaacacctga acctaacttt ttaaatcttg gatttccaga

23941 agtttttaat ttattaatgc gtttttctat attatgcgtc atcatttctc ctttattctc
24001 gctcacactc ccaccaccat tcaacgtcta cacttgtagg cgttttttga ttagtaaaat

24061 cacaatgaat cccctttggc taacttatcg ccatctattt tttgtgaaat aaattccaag

24121 tatttacgcg cattatgtga cgataaatct ctaggcaacc cataagcgaa tggctgatta

24181 ccactagtta aaacttcata tactatagtt tcttttttca ttctgcaatt agttattttc 

24241 atcataaacc ccttctaaac accgccgaaa tagacgtctc tttcaaataa gcacgattaa

24301 cactttaatc ctttaatcca catatattta aaagtgaggt agtaggtaat aaatacaaga 

24361 ctcaaagcta agattgcttt ttccatgtca attcctcctt cgtttatatt catattaaag 

24421 cgctaaatat acgccattaa tcacaacaca accttgccca ttaccttaat atcactaaac

24481 gaagcgactt tgatatcacc atacttcgga tttagagata ccaaattaat atagccttcg

24541 catatatcca cacgcttgat aagacttact ccatctaaca caacgagtgc aattgcacca

24601 tctttaatag aatcttcctt cccaataaaa gcgcacgttc cttgcttcaa cataggttcc

24661 attgaatcac cattaactaa aatacaaaaa tcagcatttg atggcgtttc gtcttcttta 

24721 aaaaatactt cttcatgcaa catgtcatca tataattctt ctcctatgcc agcaccagtt

24781 gcaccacatg caatatacga tactagttta gactcctcat attcacctac agaagtgacc

24841 ttaccctgtt catctaattg ctcatttgca tagttaagta cgttttcttg gcggggaggc 

24901 gtgagttgag aaaatatgct attgattttt gacattaccg tcccatcttg acgttcttcg

24961 tcaggaactc gataagaatc tacatcatac cccataagcc acgcttcacc gacacctaaa 

25021 gttttagata ataagaataa tttatgctgg tctggagaag accttccatt aacatactgg

25081 gacaagtgac ttcttgacat ttcaatattc aattcttttt gaaagggttt cgacttttct

25141 agaatatcta cttgacgcaa gttcctatct ttcataattt gttttaatct ttcagaagtg 

25201 ctttgcattg gtaatgcctc cttgaaattc attatatagg aagggaaata aaaatcaata 

25261 caaaagttca actttttcaa ctctttgtgt cgacattgtt caaaattggg gttatagtta

25321 ttatagttca aatgtttgaa cttaggaggt gattatttga atactaatac aacttttgat 

25381 ctttcgttat tgaacggtaa gatagtcgaa gtgtactcga cacaatttaa ctttgctata 

25441 gctttaggtg taccagaaag aactttgtct ctgaagttga acaacaaagt accatggaaa 

25501 acaacagaca ttattaaagc ttgtaagtta ttgggaacac ctacaaaaga tgttcacaaa

25561 tatttcttta aacagaaagt tcaaatgttt gaacttaata agtaaaggag gcataacaca 

25621 cgcaagaacg agaaaaggtt aacaaaagta acacatcctc aaatgaagca tcaaaacctc

25681 ttaggacaaa ttgaagctca cgacaaaacg cttaaagaaa taaagtacac ccgagacctt

25741 tacaacaaac acctaagcat gaacaacgaa gacgcattcg ctggtttgga aatggtagag 

25801 gatgaaatta ctaaaaagct acgaagtgct atcaaagagt tccaaaaagc agtgaaagcg 

25861 ttagacaagc ttaacggtgt tgaaagcgac aacaaagtta ctgatttaac agagtggcgg 

25921 aaagtgaacc agtaacattc acttcctaac ataaccacgc ttatcaacat ccacattgag

25981 cagatgtgag cgagagctgg cgatgatatg agccgcgttt aaatacattc gatagtcatt 

26041 gcgataaccg tccgctgaat gtgggtgttg aggaaaaagg aggatactca aatgcaagca 

26101 ttacaaacat tcaattttaa agagctacca gtaagaacag tagaaattga aaacgaacct

26161 tattttgtag gaaaagatat tgctgagatt ttaggatatg caagatcaaa caatgccatt 

26221 agaaatcatg ctgatagcga ggacaagctg acgcaccaat ttagtgcatc aggtcaaaac 

26281 agaaatatga tcattaccaa cgaaccagga ttatacagtc caaccttcga cgcctctaaa 

26341 caaagcaaaa acgaaaaaat tagagaaacc gctagaaaat tcaaacgctg ggtaacatca 

26401 gatgtcctac cagctactcg caaacacggt atatacgcaa cagacaatgt aattgaacaa 

26461 acattaaaag atccagacta catcattaca gtgccgactg agtataagaa agaaaaagag

26521 caaaactcac ttttacaaca gcaagcagaa gttaacaaac caaaagtatt actcgctgac 

26581 tcggtagccg gtagtgataa ttcaatactt gtcggagaac cagcgaaaat acttaaacaa 

26641 aacggtgttg acataggaca aaacagattg ttcaaatggt taagaaataa tggacacccc 

26701 attaaaaaga gtggagaaag ttataactta ccaactcaaa agagtatgga tctaaaaatc 

26761 ttggacatca aaaaacgaac aactaataat ccagatggtt caagcaaagt atcacgtaca

26821 ccaaaagtaa caggcaaagg acaacaatac tttgttaata agtttttagg agaaaaacaa 

26881 acatcttaaa aggaggaaca caatggaaca aaccacatta accaaagaag agttgaaaga 

26941 aattacagca aaagaagtca gagaggctat aaatggcaag aaaccaatca gttcaggttc 

27001 aattttcagc aaagtaagaa tcaataatga cgatttagaa gaaatcaata aaaaactcaa 

27061 tttcgcaaaa gatttgtcgc taggaagatt gaggaagctc aatcacccga ttccgctaaa 

27121 aaagtaccag catggcttcg aatcaactca ccaaaaagcc tatgtacaag atgttcatga

27181 ccatattaga aaattaacat tatcaatttt tggagtgaca cttaattcag acttgagtga 

27241 aagtgaacac aacctagcag caaaagttta tcgagaaatc aaaaactatt atttatacat 

27301 ctatgaaaag agagtttcag aattaaccat cgacgacctc gaataaagga ggaacaacaa 

27361 atgttacaaa aatttagaat tgcgaaagaa aaaaataaat taaaactcaa attacccaag 

27421 catgccagtt actgcctaga aagaaacaac aaccccgaac tgctgcgagc agttgcagag 

27481 ttgttgaaaa aggctagcta aattcaacgg taaggatttg ccccgcctcc acacttagag 

27541 cttgagatcc aacaaacaca taagtcttag tagggtctag aaaaaatgtc ccgatttcct

27601 ccctcgtaac agctccaatt cctccatatc ccggaaaaac aacttccttt aaatccgaaa 

27661 catgtttttt tgaaccatcc tttaaagtaa ctagaagttt catacttatc acctccttag

27721 gttgacaaca acattataca cgaaaggagc ataaacaaca tgcaagcatt acaaacaaat 

27781 tcgaacatcg gagaaatgct caatacccaa gaaaaagaaa atggagaaat cgcaatcagc

27841 ggtcgagaac ttcatcaagc attagaagtt aagacagcat ataaagattg gtttccaaga 

27901 atgctttaaat acggatttga agaaaacaca gactacacag ctatcgctca aaaaagagca 

27961 acagctcaag gcaatatgac ccactacact gaccacgcac tcacactaga caccgcaaaa

28021 gaaatcgcaa tgattcaacg tagtgaacct ggcaaacgtg caagacaata tttcatccaa 
28081 gctgaaaaag catggaacag cccagaaacg

28141 aacacaatca atcaattaga aacaaagatt 

28201 gacgcagtag ccactactaa gacatcaact

28261 caaaacggca taaacatcgg gcaacgcaga 

28321 cttattaaac gcaagggtgt ggattataac 

28381 ttattcgaaa ctaaagaaac atcaatcaca 

28441 acgccaaaag taacaggtaa aggacaacaa 

28501 caaacaactt aataggagga attacaaatg 

28561 acaatggcag ttgtgacgtg gaaggtttgg 

28621 attagtagca gggcgttgag tgactatcta 

28681 gctgaaaatt ccactgaatc tgctcgtcgc

28741 aaataacaac attatacacg aaaggaaaga 

28801 caccagaaaa cacatataga ggcgaagaaa 

28861 cacaaaccca tcaattgtct ggagtacgca

28921 accgcaaaga caatttaggt gtagaaaatt 

28981 tgattaatat ttctaaattg gaagagtacc 

29041 aggatactaa acgagcaaca tttataaaag 

29101 cttagcgatt gtacttatgc cgtttctata 

29161 cgcaagtatc gcaacattca tgtactacaa 

29221 ctacttgttg gagcaagtaa cagtatcaaa 

29281 aacgaaaaac ggaggaagtc aagatgtatt 

29341 ttcatgttaa cggattcgat tttaagctat 

29401 tacaagttaa agatacgaac aacgtaccaa 

29461 acttagatat ggcatcagac ctatttaacc

29521 cagacgaaca ggacagacta attaacttag 

29581 agactgtaac ttatatcatt cgtcacaggg 

29641 ctgataacaa ttcagacatt agttactcca 

29701 gtatggaaga agcgagtatc aatatggatt 

29761 aaactattga gtacgaggag gtagaacatg 

29821 gtaagcatac tcaaaaaact aaagataaat 

29881 tataaattcg cagtatacgg aaaaattggc 

29941 aaagacgctt tcgtcattga cattaacgaa 

30001 gacgtagaaa tcgagaacta ccaacacttt 

30061 ttacaggaga tgagagaaaa cggacaagaa 

30121 aaacttagag atatgacatt gaatgatgtg 

30181 aatgattggg gagaagttgc tgaacgaatt 

30241 caagaagaat acaaattcca ctttgttatt 

30301 gatgatgaag gtagcactat caaccctact 

30361 aaagctatta cttctcaaag tgatgtgtta 

30421 aacggagaaa agaaagctag atatattcta 

30481 aagattagac attcaccttc aataacaatt 

30541 acggacgtag tagaagcaat tagaaatgga 

30601 aattatgaaa atcacaggac aagcgcaatc 

30661 taacggctca gcagggtctc aagctggaga 

30721 caatgataga gaaaatagat atttcacaat 

30781 taaacataae caatctgtac cgccgtataa 

30841 aetagetace cgactaggta tcaagccaaa 

30901 tcttattggt aagtcttgtc acttggtatc 

30961 gtattttacg gatttttcat ttattaaacc 

31021 acctattccg aagacagata agcaaaaagc 

31081 accaatgtct caacaaagca atccatttga 

31141 agatttagcg ttttaaggtg tggtttaaat 

31201 cgacggcacc tattccgtcg ttgctactgg 

31261 actagaaaac ggatatccac taaaagcaga 

31321 tatagaacaa cgcaaaaaaa tattcgcaat 

31381 accagtagaa tcaactagaa aatcattaca 

31441 agaaatcagt ctgcgcgact gttctacgaa 

31501 agcgeecacg ttccaccacc aaaeacccat 

31561 agataaagcg ttattatatt gggctacaat 

31621 tcacgcagac ctggcacatt atgaagcagt 

31681 ccactatgac aaacatgtat tagcgttatg 

31741 tggcgctaag tcgttcgatg ataaatacca 

31801 gaggctcaat aaaacgttga aaggagagaa 

31861 atagcacccc caatcgccat cttggcggaa 

31921 gcggagaaaa ctctaaaatc tccgtttagt 

31981 aggcggacaa actaattgag ccttttttga 

32041 ttaatacttc aaatccaatg ccagaaagtt 

32101 ttaacattct tttaacaaat tctaatcccg 
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actatgcaac gtgctttaaa aattgctaac 
gcacgtgaca aaccaaaaat cgtactcgca 
ttagttggag agttagcaaa gatcactaaa 
ttgtttgagt ggtcacgcca aaacggattc 
atgcctacac agtatccaat ggaacgcgag 
cattcggacg gtcacacacc aaccagcaag 
tactttgtta acaagttttt aggagaaaaa 
aacgcactat acaaaacaac cctcctcatc 
aagattgaga agcacactag aaaacctgtg 
aacaacaaat ctttaaccat accgaaagat 
cttttgaagt ccgccgaaca aactattagc 
tagaaatgcc aaaaatcata gtaccaccaa 
aatttgtgaa aaagttatac gcaacaccta 
gaagtacagt acacaactgg ttgaaaeatt 
catacactga ttattcacca acaggcactc 
cgatcagaaa gcacaaaaaa eggtattagg 
ctacctagta gcagtattat gcttcacagt 
cctcaccaca gcatggtcaa ttgcgggatt 
agaatgcttt ctcaaagaac aaaaaaactg 
cacttaagaa aaaattcatg ttcaatataa 
acgaaatagg cgaaatcata cgcaaaaata 
tcactttaaa aggtcatatg ggcataccaa 
ttaaacatgc ttatgtcgta gatgagaatg 
aagcaacaga cgaatggatt gaagagaaca 
ecaegaaaeg gtaggaggtc gctatgaagc 
atacgccaat ttatataact aacaaaccaa 
caaatagaaa tagagctagg gagtttaacg 
accacaaagc aatcaagaaa acagtgacag 
actgaggaaa aacaagaacc acaagaaaaa 
aatatcgccg agaaaaataa aaggaaattc 
tcaggaaaaa ccacgtttgc tacaagagat 
ggtggaacaa cggttactga cgaaggatca 
gtttatgttg taaatttttt acctcaaatt 
atcaatgttg tagttattga aactattcaa 
atgaaaaata agtctaaaaa accaacgttt 
gecageatgt acagatcaat aggaaaactt 
acaggtcatg aaggtatcaa caaagacaaa 
atcactattg aagcgcaaga acaaattaaa 
gctagggcaa tgattgaaga atttgatgat 
aacgctgaac cttctaatac gtttgaaaca 
aacaacaaga aatttgcaaa tcccagcatt 
aaccaaaaat eaattaaaag gacggtattt 
tactaaagaa acaaaccaag aaaagtttta 
actcacagtg aaagttaaaa atattgaact 
cgtatttgaa aatgacgaag gcaaacaata 
atatgatttc caagaaaaac aattgattga 
cccccctagc ctagaccttg acaccaatga 
gaaatggaaa ttcaatgaag atgaaggtaa 
tcacaaaaag ggcgatgatg ttgttaacaa 
tgaagaaaat aacggggcac aacaacaaac 
aagcagtggc caatttggat atgacgacca 
gcaatacatt acaagatacc agaaagataa 
tgttgaactt gaacaaagtc acattgactt 
agtagaggtt ccggacaata aaaaactatc 
gtgcagagat atagaacttc actggggcga 
aacagaattg gaaattatga aaggttatga 
agttgcaagg gagctaatag aactgattat 
gagfcgtagaa acgagtaagc cgctaagcga 
caaccgcaac tgtgtaatat gcggaaagcc 
cggcagaggc acgaacagaa acaaaatgaa 
tcgcgaacat cacaacgagc aacatgcgat 
cttgcatgac tcgcggataa aagttgatga 
aaaggaatga acagactaag aataacaaaa 
gagattagaa atgctatgca tgctgtaaaa 
taatacaggc tettacaaaa gctttaccae 
tgtctattac ccaggggctg taatgtaact 
tacttactgt ttctaggttg cgtcctgact 
aaacaaacct ttgtttttct ataatcttat 
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32161 taaagtgatt taaaaactga ggagcacaaa 

32221 aagacatgcc aaaagtttca tttaaaaccc 

32281 cggctgattc tatatctaac ggagagcctt 

32341 cattctttgg gtttaaaacc gccctatatt 

32401 aatgttctaa aagaatagca tcatttgggg 

32461 ggtgggtcaa tgagtccttt ctgtcatcca 

32521 tacttaaagt tttttcacta atgtaaaact 

32581 aaaattgtgg ttcttgtaaa tcatttttag 

32641 ctctgaattt ttcaaattct acttctcttt 

32701 acttcccaaa gacaagtccc caagttttag 

32761 cttcaataat tttatcaata cctttaccta 

32821 ctaacgcaat agcgataata aaattatacc 

32881 aagttactac tcaataatta cagcaaacgt 

32941 aaagttactt tttgcagaaa taacatcttt 

33001 caatggttac tttgcaactt tatacaacgt 

33061 gaaccttacc aactttggtt atctaaaaat 

33121 acaaaggaag atgtacccct tgacgcaaac 

33181 ccctattgat aattctgtca atacccctat 

33241 tattaataat acaagcaaca acaacataaa 

33301 agcatcttcc ataccctata aagaaattat 

33361 tttcaaacac aatacagcta aaacaaaaga 

33421 taggttggag gattttaaaa aggtgattga 

33481 tagcgataaa taccttagac cagaaacact 

33541 tcaaaaaata caaccaactg gcacggatca 

33601 ttgggattag ggggatatta tgaaaccact 

33661 aaaatatcaa cccacccatg ccgaaaaagg 

33721 cgacttatat aagtttgcec ctactaaaaa 

33781 ttgcaaatgt gaaatctacg aggaatataa 

33841 atccaatcaa tcaaacgcta atccgccttt 

33901 acaaaatgaa aaacaagtac acgctaaaca 

33961 tacaaaagaa ccaaaatcat taatattgca 

34021 agcatacgct atcgcaaaag cagtcaaagc 

34081 accaatgttg atggatcgta tcaaagcgac 

34141 cgagctagtc agattgctaa gtgatattga 

34201 aaacacagag cacactttaa ataaactttt 

34261 caacatcttt acaactaact ttagtgataa 

34321 tataaattcg agaatgaaaa aaagagcaag 

34381 ggagcgagat gcatggtaac caaagaattt 

34441 tacgcteaga aactcataga tgaggcacag 

34501 atccaaaaac ttgcagaacg tcatacacgc 

34561 aaaatgccga aagaaaaata ttacttatac 

34621 atcaagtata aagacaacgt aaatgaggtt 

34681 gaaaagaaaa ttatgactga tagtgaccta 

34741 tatgagcaag aattaggttt acaagcaacg 

34801 aaatacaacg ctaagaaagt tgagtacaaa 

34861 gaacattacc aatatttaga aagtaatatg 

34921 caaccgaaat tcgaattatt accaaaacta 

34981 gacttcgcgt tatatctcga tggcaaactg 

35041 accgaagtag caaaacttaa agctaagatt 

35101 aattggatat gtaaagcgcc taagtataca 

35161 attaaagcaa gacgagaacg caaaagagaa 

35221 tataaatgca acgattgata taaggatacc 

35281 tgtggataaa gaaaaagaag cgccggcaga 

35341 agagtatgac aatctaaaaa ttagaaacgt 

35401 tgtaatcatt aataataaac cataeaaact 

35461 agcgtgggat aaatgccgga attgtetcta 

35521 aagctttaga cgcgccttat ggcatgcacc 

35581 aaaagattaa acaagcgaga ctcgaacgtg 

35641 agctacgtaa gaagaagcca caettgttta 

35701 actggttcga tgtcacttat aaccaaatgt 

35761 aatcagtaac agaaaagtag atatgaacaa 

35821 ttacacatac ggcgacattg aaateataga 

35881 accacaatta gcattcgcaa taggtaatgc 

35941 gaatggtcac gaggattcag caaaggcgaa 

36001 ggagtgatga ccacgacaga tagcggacgc 

36061 aagagatatc tgtatcagga caacgaacga 

36121 tattactttc acggtcatac cgtgccaggt 

36181 gcggaagagc ttgaaacaca tacaaagcaa 
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actcaccata aactccctcc tttgttaagt 
ctaacctcac taggctatta attgaaattt 
ttattaacgt gtccgatata ttcataccgt 
taacggcagg acgtactccg tgattcttta 
ataattgttt aattatttca acaaatgaat 
cagatgacgc tactagcttt gcgaacatat 
tcgaagcttc tagagcagga cctagaagag 
gtacagaaga tatttctttt teaaattgtt 
gataaacaac tttatccaca taaaggtgga 
agaatgtctc cacaggccct ttcgatgcgc 
aaataggacc cataattatt cacccccaat 
agaaaggaga atcaacatga ccgaccaacc 
cagatacgat aaccgactta ccgacagcga 
aagtaacaaa tacggatact gcacagcaag 
tgttaaggaa actatatctc gcagaatttc 
cgaaactacc aaagaaggta atgaagctaa 
gtcaatacct attgacgcaa aaatcaatac 
tgacgcaaat gtcaaagaga atactacaag 
tagaatagat atattgtcgg gcaacccgac 
cgatcactta aacaaaaaag cgggcaagca 
ttttattaaa gcaagatgga atcaagattt 
tatcaaaaca gctgagcggc taaacacgga 
ttttggcagt aaacttgagg ggtacctcaa 
atcggaacgc atgaagtacg acgaaagtta 
attcagcgaa aagataaacg aaagcttgaa 
attgaaatgt gagagatgtg gaagtgaaca 
acacccgaat ggttacgagt ataaagacgg 
gcgaaacaag caacggaaga taaacaacat 
aagagacgca acagtcaaaa actacaagcc 
aacagcaata gagcacgtac aaggcttctc 
aggttcatac ggaactggta aaagccacct 
taaagggcat acggttgctt ttatgcacac 
atacaacaaa aatgcagtag agactacaga 
tttacttgta ctagatgata tgggtgtaga 
cagcattgtt gataacagag taggtaaaaa 
agaactaaat caaaatatga actggcaacg 
aaaagtaaga gtaatcggag acgatttcag 
ttaaaaacta aacttgagtg tccagataeg 
ggcgatgaaa ataggttgta cgacctattt 
cccgctatcg tcgaatacta aggagtgtta 
cgagaagacg gcacagaaga tattaaggcc 
tattcgctca caggagccca tttcagcgac 
aaacgattca aaggcgctca cgggcttcta 
atatttgata tttagaggtg gacgatgagt 
ggaactgtat ttgatagcaa agtagagtgt 
aatggcacta attatgatca tatcgaaata 
gataaacaac gaaagactga atatattgca 
attgaagtta tcgacattaa aggtatgcca 
ttcagacata aatacagaaa cataaaactc 
ggtaaaacat ggaccacgca cgaggaacta 
atgaagtgat ctaatgcaac aacaagcata 
tacagaagtc gaatatcagc attttgatga 
ttacetaeat aacaatcccg acgaaatact 
aaatgtagag gtggaataaa cgggcagtgt 
taacaatttt gaaaaaagaa ataacggcaa 
aacgtgttag aggttgttgg gagctttcag 
taaaagaata tagagaaatg aaacaaatgg 
aactggaaag agagcgaaag aaagaggctg 
aegtacceca aaaacateca cgtgacccgt 
tcaagaaacg gagtgaagca taatgagcat 
aacgcaagac aacgttaagc aacctgcgca 
tteeaetgaa caagccacgg cacagtaccc. 
aatcaaatac ttgtctagag caccgttaaa 
gttctacgtc gatagagtat ttgacttgtg 
aaagaacacc taaaacatct tttcggctce 
gtggcacata tccatgtagt aaatggcact 
tggcaaggtg cgaaaaagac atttgacaca 
agcgatttgg aatatgagga acagaagcaa 
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36241 ccaactttac tttaaaaggg cggaaacaat 

36301 acctgaactt atccaatggg cttgggataa 

36361 tccaaacgac gttgagcgca actgctttgt 

36421 tgtgactgga tatgtatcaa ctaacgacaa 

36481 aaaaccaaag ttaaaaaaga aatgagatta 

36541 ccggatctac cacaaggaaa aatattttet 

36601 cctcacccaa acacaaataa gtgttcgacg 

36661 atagttgata ttgaaaaaga agtaacggaa 

36721 ttcgagattc aagaaggaga ctataactct 

36781 tgtetacatg gcagaegtgt gcctaccaaa 

36841 atgacgctaa tctggaaaga tggggagttg 

36901 ataaagataa aaaagttatg agtattattg 

36961 tgatttcaac aggttataaa agtttcaatg 

37021 aagatgtgca cggtgtggag atttatgaag 

37081 aagtaagttt tatcgagttt aaagaaggag 

37141 aactactaag tgaaaatgac gatattattg 

37201 tgctattgga ggttatgaga tgacgttcac 

37261 cactaactct aacaagttat tagataaact 

37321 caagaagcaa cgagatgagc teattgggga 

37381 tctagagaag aaagcaagcg catgggatag 

37441 aaacgaattc ggtaacgatg atgaaagagt 

37501 etctatggag gatgacacaa atgaataatc 

37561 ctagtgcgta taacggtaat gacacagagg 

37621 agaaagcgca agcgtttgat gaaatacttg 

37681 ttaaagaagg tattgaactt gatgaagcag 

37741 aatacgagga ggaataggaa aatgactaac 

37801 gctagaatgc ccgaacgaaa tcataagacg 

37861 acegtcgtac tcgaaccaca agaaaaagca 

37921 ccagagggct atgtcggact attaactagt 

37981 gtgactgaaa caggcaagat agacgcggga 

38041 aatgatgaag aacgtgatgg aatacccttt 

38101 gatggattaa taagcatttt agatataaaa 

38161 agaagagttt accaaaccaa caaaggcgat 

38221 tggacaccgg aactaaagca agtggaggaa 

38281 ggcttcggaa gtagcggagt gtaaagacat 

38341 gtgacgcaat acttagtcac aacattcaaa 

38401 actgtggcta gagataatca gacgtttaca 

38461 aaagagaagt acgaggcaca agttaaaaga 

38521 gaaaatataa gggagtgtgg gaaacgacgg 

38581 ctcaccttgc aaaaacagct gaaccttttg 

38641 atggttatat ttacgcaagt accataatca 

38701 ctgaatcaac cacacttatt gaggagcatg 

38761 tgacggtagt agtttgatat tacatgaaga 

38821 ggacaatttt agaaatgatg atgactattt 

38881 tgtattgaac aaaggttata cagttgggac 

38941 tacctaaaat gaaattcccg aaaaagtaca 

39001 cacctgaaga aaaggctaag attgaagatg 

39061 gtgaatttta cagtcctacg atggctaata 

39121 gaacgacgcc tagtttaatt gatactggag 

39181 atggatgggt tcgacatctt tattgttgga 

39241 ccactcgtta ccacattgcc tatctaeaca 

39301 caaggaacta ttacagataa atataacaag 

39361 gtattagaca acaaacaagt cattgaaaat 

39421 agcgcagata tacaagctag gttaaaagta 

39481 tatagaatac actttttaaa tttatatccg 

39541 caatgatcaa acaaatacta agactateat 

39601 atgtaactga gcaagtgtat attatgatga 

39661 attacgtctt tcgagcggag gtgagtgaat 

39721 ttcgctgttt gctttcttaa tatccatata 

39781 attaggaatt tttggtatgt ataaaattat 

39841 gtagataaaa atgaaegagc aaataatagg 

39901 gctttattca gttaaagaga tttctaggta 

39961 aatcaattta gaacaaatat atccgatata 

40021 gattggagct tatattatcc caacagaaca 

40081 agtctttaat aatttagata agcaaagtaa 

40141 acaaatgact aatttatcaa atagagttaa 

40201 caatgaactt agtacaaatc agattttttt 

40261 tattataaat gaatatcaaa aagatatatc 




gaaaaccaaa attgaaaaag aaatgaactt 
ccccaagtta ccaggcaata aaagattcta 
gacttttcat gttgatagca ccttacgtaa 
atttactgtt caagaggaga tataacaatg 
gatgaattaa ctaaatgggc gcgagaaaat 
ccaacaggac ctagtgatgg actcgttcgt 
ccaagcttta ttccaattga catccccctc 
gagactaagg ttgataggtt gattgaatta 
acactatatg agaacactag cacaaaagaa 
gcaccccaca tcttaaacga tgacccaact 
ctagtatgat gttgaaactt aaagcttggg 
acgaaatcga ttttaatagt gggtacattt 
aagtaaaact attacaatac acaggattta 
gggacattgt tcaagattgt tactcgagag 
ccttttatat aacttttagc aatgtaactg 
aaattgttgg aaatattttt gaaaatgaga 
cttatcagae gaacaatata aaaatctttg 
tcacaaagca ttaaaagatc gtgaagagta 
eatagcgaag ttacgagatt gtaacaaaga 
gtattgcaag agcgttgaaa aagatttaat 
taaattcgga atggaattaa acaataaaat 
gcgaaaaaat cgaacagtcc gttattagtg 
ggctgctaaa agagattgag gacgtgtata 
agggaatgac aaatgctatt caacattcag 
tagggattat ggcaggtcaa gttgtctata 
acattacaag taaaactatt atcaaaaaat 
gatgcaggct atgacatatt ctcagctgaa 
gtgatcaaaa cagatgtagc tgtgagtata 
cgtagtggtg taagtagtaa aacgtattta 
tatcatggca atttagggat taatatcaag 
ttatatgatg atatagacgc tgaactagaa 
ggtaaccatg tacaagatgg aagaggcata 
aaactagccc aattggttat cgtgcctata 
ttcgaaagtg tttcagaacg tggagcaaaa 
cttagaccga gctaaggagg ccttggggaa 
gattcaacag gacgaccaca tgaacatatt 
gtcattgagg cagagagtaa agaagaagcg 
gatgcagtta ttaaagtggg tcagttgtat 
atgttaaaat taaaactatt tcaggcggag 
aaaaacatgt tgaaagaatg acgagtttta 
agaaaccaac gtatattaaa acagatacga 
ggaaatgaat cagctgagaa ttttattaca 
tgaattattt aacgaaatag tatttgtttt 
aacgatagaa aaagattatg gcagagaact 
caacgttgag gaggcagatg atgactaaca 
ctgaaataat caaaaaatat aaaaataaag 
aetttattaa agaaattaaa gataaagaca 
tgaatgaata tgaattaagg gctaegctaa 
atgacaatga cgatcaaaaa acttaaaaat 
atactgtcat tattcggcat attcgcattg 
gtggccagcc accaacacaa agaattacat 
agacaagaca aagaagacaa gttctatatt 
tccgacttat cattcaaaaa gaaatttgat 
ggcgataagg eagaagttaa aacaatcggt 
gtcttatacg aagtaaagaa ggtagataaa 
tcttactagc aatgtatgag ctaggtaagt 
cggctaatga cgatgtagag gcgccgagtg 
aatgagaata ctcacccatg atctgatcgc 
tattattgat gatggagtga taataaatgc 
agatccctct ccagaaaaca ttataaagag 
aagcatatat actttagcag gaggtgttgt 
CCttacagac tctaacttac aacgtaaaaa. 
tttagattgt tttaaaaagg ctaaaaagat 
gcatgaactt ttagactttt ttgacattga 
aaaagcgtat gaaaatgtta ttggatctag 
ggcaacggaa gattttaaga tgagtttcaa 
taatccttct tctgttatgg aaacaattgc 
ttatctaaaa aatataacca ataaaatgaa 
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40321 tgaaaataga gcttacaacc atattgacag ctttatcact tcagagtacc gacgaaaaat 

40381 aaacgatcat aatctttatc ttgataaatt egaagaacag cttagtcaaa agtttaaaat 

40441 aaacagaact tcgataaaag aaagaattat tattaattta aacaagagga gatttaaatg 

40501 atgtggatta ctatgactat cgtatttgct acattgctae tagtttgtac cagtattaat 

40561 agtgatcgtg caagagagat acaagcactt agatatatga atgattatct acttgatgaa 

40621 gtagttaaaa ctaaagggta caacgggtta gaagaataca ggattgaatt gaagcgaatg 

40681 aacaacgata ttaaaaagta atttatacta tcggaggtat tgcattgaat gataaagatt 

40741 gagaaacacg atatcaaaaa gcttgaagaa tacattcagc acatcgataa ctatcgaaga 

40801 gagttgaaga tgcgagaata tgaattactt gaaagccatg aaccagataa tgcgggagct 

40861 ggcaaaagta atttgccggg caacccgatc gaacgatgcg caataaagaa gcctagtgat 

40921 aacaggtaca atacattaag aaatatagct aacggtgtag atagatcgat aggtgaaagt 

40981 gatgaggata cgcttgagtt attaaggttt agatattggg attgtcctat tggttgttat 

41041 gaatgggaag atatagcaca ttactttggt acaagtaaga caagtacatt acgtagaagg 

41101 aacgcactga tcgataagtt agcaaagtat attggttatg tgtagcggac ttttacccta 

41161 tgeaagtccg cattaaaaca gtttattatg ttagtatcag attaatattr aaagttatta 

41221 aatgctaata cgacgcatga acaagaggcg catcactatg tgatgtgtcz ttttatttat 

41281 gaggtatgaa catgttcaaa ctaattgtaa atacattact acacaccaag tatagacgag 

41341 tcttgatact acttaagtta tataaggtga aacattatga tgactaaaga cgaacgtata 

41401 cgattctaca agtctaaaga atggcaaata acaagaaaaa gagtgctaga aagagataat 

41461 tatgaatgcc aacaacgtaa gagagacggc aagctaacga cacatgacaa aagcaagcgt 

41521 aagtcgttgg atgtagatca tacattatcg ctagaacatc atccggagtt tgctcatgac 

41581 ttaaacaatt tagaaacact gtgtattaaa tgtcacaaca aaaaagaaaa gagatctata 

41641 aaaaaagaaa ataaatggaa agacgaaaaa tggtaaatac ccccgggtca aaaaaatcaa 

41701 aagcgacc 
      Name        Position

1     77ORF005    19572..21026
2     77ORF006    3976..5196
3     77ORF007    21871..23076
4     77ORF008    2120..3307
5     77ORF009    31946..32803
6     77ORF010    26092..26889
7     77ORF011    24441..25208
8     77ORF012    29788..30576
9     77ORF013    33620..34399
10    77ORF014    27760..28512
11    77ORF015    3291..4028
12    77ORF016    32867..33610
13    77ORF017    23269..23982
14    77ORF018    31169..31840
15    77ORF019    39851..40501
16    77ORF020    6926..7570
17    77ORF021    37762..38304
18    77ORF022    30605..31156
19    77ORF023    26903..27346
20    77ORF024    10700..11140
21    77ORF025    9707..10147
22    77ORF026    40729..41145
23    77ORF027    6518..6925
24    77ORF028    34795..35199
25    77ORF029    6117..6521
26    77ORF030    36478..36879
27    77ORF031    39151..39546
28    77ORF032    33892..34266
29    77ORF033    5758..6120
30    77ORF034    7886..8236
31    77ORF035    19258..19560
32    77ORF036    36876..37223
33    77ORF037    102..446
34    77ORF038    34908..35219
35    77ORF039    37220..37528
36    77ORF040    41377..41676
37    77ORF041    35454..35753
38    77ORF042    5490..5774
39    77ORF043    29304..29564
40    77ORF044    18481..18768
41    77ORF045    5216..5500
42    77ORF046    25663..25935
43    77ORF047    11159..11425
44    77ORF048    28776..29039
45    77ORF049    36013..36255
46    77ORF050    35753..36007
47    77ORF051    38931..39167
48    77ORF052    1762..2013
49    77ORF053    37521..37757
50    77ORF054    22818..23060
51    77ORF055    17546..17788
52    77ORF058    18892..19122
53    77ORF059    34564..34785
54    77ORF064    29574..29795
55    77ORF065    28528..28746
56    77ORF066    27494..27703
57    77ORF069    38341..38547
58    77ORF070    36269..36475
59    77ORF071    40498..40701
60    77ORF072    38735..38938
61    77ORF073    30945..31148
62    77ORF074    38544..38738
63    77ORF075    13673..13870
64    77ORF077    25357..25605
65    77ORF079    29089..29280
66    77ORF080    35204..35389
67    77ORF085    24060..24242
68    77ORF092    39706..39876
69    77ORF094    32226..32393
70    77ORF096    13606..13773
71    77ORF098    7092..7256
72    77ORF102    29051..29212
73    77ORF104    34393..34551
74    77ORF109    18282..18434
75    77ORF112    39543..39692
76    77ORF117    27361..27501
77    77ORF118    38390..38530
78    77ORF120    36059..36199
79    77ORF124    33699..33833
80    77ORF128    14221..14355
81    77ORF130    15675..15806
82    77ORF133    8414..8542
83    77ORF140    13113..13235
84    77ORF147    7029..7148
85    77ORF149    30668..30787
86    77ORF151    31837..31953
87    77ORF155    30278..30391
88    77ORF157    4044..4157
89    77ORF167    20692..20799
90    77ORF175    35717..35821
91    77ORF176    6836..6940
92    77ORF178    35390..35491
93    77ORF179    8318..8419
94    77ORF182    29268..29564
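
Each Position entry above is read here as an inclusive coordinate range on the phage 77 genome in `start..end` form. As a minimal sketch under that assumption (the helper name `orf_length` is hypothetical and is not part of the original listing), the length of an ORF in nucleotides and in codons can be derived from an entry as follows:

```python
def orf_length(position: str) -> tuple[int, int]:
    """Return (nucleotides, codons) for an inclusive 'start..end' range."""
    start_text, end_text = position.split("..")
    start, end = int(start_text), int(end_text)
    nucleotides = end - start + 1  # both end points belong to the ORF
    return nucleotides, nucleotides // 3


# 77ORF102 spans 29051..29212: 162 nt, i.e. 53 residues plus one stop codon.
print(orf_length("29051..29212"))
```

The codon count includes the terminating stop codon, so the encoded protein is one residue shorter than the second value returned.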
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Table 4 



77ORF017 sequence 

23982 atgacgcataatatagaaaaacgcattaataaattaaaaacttct 

1 MTHNIEKRINKLKTS 

23937 ggaaatccaaaatttaaaaagttagattcagatattcactattta 

16 GNPKFKKLDSDIHYL 

23892 ctcaagagatttgaaggtgaaaaaaaccataaaggtttttatcca 

31 LKRFEGEKNHKGFYP 

23847 aagtttaaacaaggagaaatagtttttgtagatttcggtataaac 

46 KFKQGEIVFVDFGIN 

23802 gttaataaagaattttctaattcacactttgcaatagtgatgaat 

61 VNKEFSNSHFAIVMN 

23757 aaaaatgattctaatacggaggatatagtaaatgttattccctta 

76 KNDSNTEDIVNVIPL 

23712 tcctctaaagaaaacaaaaagtatttaaagatgaattttgatttg 

91 SSKENKKYLKMNFDL 

23667 aaatgggagtattatttaagattgtttttaaatttaattagcgcg 

106 KWEYYLRLFLNLISA 

23622 caaaataattcagctatattaaaagaagttttcgataaaaaatac 

121 QNNSAILKEVFDKKY 

23577 caaaaaaacaacacagaattcatcactaaagattattttattgaa 

136 QKNNTEFITKDYFIE 

23532 tttatatctgatagtttagaaattgaaaataaattaaataaaatt 

151 FISDSLEIENKLNKI 

23487 gacagaaacattaataacatagtatcagcaattgataaggtaaaa 

166 DRNINNIVSAIDKVK 

23442 aaattaaaaggtaatagttacgcttgcataaattctttccagccg 

181 KLKGNSYACINSFQP 

23397 attagtaagtttcgcataagaaaagttttaccccaaaaaattaaa 

196 ISKFRIRKVLPQKIK 

23352 aatccagtaatagattcttcggatattatgttactgataaataga 

211 NPVIDSSDIMLLINR 

23307 attaataataatatattgcagatccctgatataagatga 23269 

226 INNNILQIPDIR* 
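
The coordinates in the 77ORF017 listing run downward (23982 to 23269), which suggests that this ORF is read from the strand complementary to the numbered genome sequence. The sketch below illustrates how such a coding sequence could be pulled out of a genome string; the function names and the short test string are hypothetical, and the 1-based inclusive coordinate convention is an assumption.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (a, c, g, t only)."""
    return dna.lower().translate(COMPLEMENT)[::-1]


def extract_orf(genome: str, start: int, end: int, reverse: bool = False) -> str:
    """Slice genome[start..end] (1-based, inclusive); flip strand if requested."""
    segment = genome[start - 1:end]
    return reverse_complement(segment) if reverse else segment


# Toy example: positions 1..6 of "ttacatgg" read on the opposite strand.
print(extract_orf("ttacatgg", 1, 6, reverse=True))
```

For 77ORF017 the call would be `extract_orf(genome, 23269, 23982, reverse=True)`, where `genome` holds the complete phage 77 sequence.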
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Physico-chemical parameters of ORF 77ORF017 

1 MTHNIEKRIN KLKTSGNPKF KKLDSDIHYL LKRFEGEKNH KGFYPKFKQG EIVFVDFGIN 

61 VNKEFSNSHF AIVMNKNDSN TEDIVNVIPL SSKENKKYLK MNFDLKWEYY LRLFLNLISA 

121 QNNSAILKEV FDKKYQKNNT EFITKDYFIE FISDSLEIEN KLNKIDRNIN NIVSAIDKVK 

181 KLKGNSYACI NSFQPISKFR IRKVLPQKIK NPVIDSSDIM LLINRINNNI LQIPDIR 

Number of amino acids: 237 

Average molecular weight (Daltons): 27887.38 

Mean amino acid weight (Daltons): 117.67 

Monoisotopic molecular weight (Daltons): 27869.83 

Mean amino acid monoisotopic weight (Daltons): 117.59 

Amino acid composition 

Acid  Symbol  Number  %       Average %       Acid  Symbol  Number  %       Average %
                              in Swissprot                                  in Swissprot
Ala   A       5       2.11%   7.58%           Cys   C       1       0.42%   1.66%
Asp   D       14      5.91%   5.28%           Glu   E       13      5.49%   6.37%
Phe   F       16      6.75%   4.09%           Gly   G       6       2.53%   6.84%
His   H       4       1.69%   2.24%           Ile   I       29      12.24%  5.81%
Lys   K       33      13.92%  5.95%           Leu   L       19      8.02%   9.42%
Met   M       4       1.69%   2.37%           Asn   N       30      12.66%  4.45%
Pro   P       7       2.95%   4.9%            Gln   Q       6       2.53%   3.97%
Arg   R       8       3.38%   5.16%           Ser   S       17      7.17%   7.12%
Thr   T       5       2.11%   5.67%           Val   V       11      4.64%   6.58%
Trp   W       1       0.42%   1.23%           Tyr   Y       8       3.38%   3.18%

Number of acidic (negative) amino acids (ED):  27    11.39%
Number of basic (positive) amino acids (KR):   41    17.30%
Total charge (KRED):                           68    28.69%
Net charge (KR - ED):                          14    5.91%
Theoretical pI:                                10.01
Total linear charge density:                   0.30
Average hydrophobicity:                        -5.37
Ratio of hydrophilicity to hydrophobicity:     1.41
Percentage of hydrophilic amino acid:          57.81%
Percentage of hydrophobic amino acid:          42.19%
Ratio of %hydrophilic to %hydrophobic:         1.37
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77ORF019 sequence 

39851 atgaacgagcaaataataggaagcatatatactttagcaggaggt 

1 MNEQIIGSIYTLAGG 

39896 gttgtgctttattcagttaaagagatttttaggtattttacagat 

16 VVLYSVKEIFRYFTD 

39941 tctaacttacaacgtaaaaaaatcaatttagaacaaatatatccg 

31 SNLQRKKINLEQIYP 

39986 atatatttagattgttttaaaaaggctaaaaagatgattggagct 

46 IYLDCFKKAKKMIGA 

40031 tatattattccaacagaacagcatgaatttttagatttttttgat 

61 YIIPTEQHEFLDFFD 

40076 attgaagtctttaataatttagataagcaaagtaaaaaagcgtat 

76 IEVFNNLDKQSKKAY 

40121 gaaaatgttattggatttagacaaatgattaatttatcaaataga 

91 ENVIGFRQMINLSNR 

40166 gttaaggcaatggaagattttaagatgagtttcaacaatgaattt 

106 VKAMEDFKMSFNNEF 

40211 agtacaaatcagattttttttaatccttcttttgttatggaaaca 

121 STNQIFFNPSFVMET 

40256 attgctattataaatgaatatcaaaaagatatatcttatttaaaa 

136 IAIINEYQKDISYLK 

40301 aatataattaataaaatgaatgaaaatagagcttataatcatatt 

151 NIINKMNENRAYNHI 

40346 gatagttttatcacttcagagtaccgacgaaaaataaacgattat 

166 DSFITSEYRRKINDY 

40391 aatctttatcttgataaatttgaagaacagtttagtcaaaagttt 

181 NLYLDKFEEQFSQKF 

40436 aaaataaacagaacttcgataaaagaaagaattattattaattta 

196 KINRTSIKERIIINL 

40481 aacaagaggagatttaaatga 40501 

211 NKRRFK* 
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Physico-chemical parameters of ORF 77ORF019 

1 MNEQIIGSIY TLAGGVVLYS VKEIFRYFTD SNLQRKKINL EQIYPIYLDC FKKAKKMIGA 

61 YIIPTEQHEF LDFFDIEVFN NLDKQSKKAY ENVIGFRQMI NLSNRVKAME DFKMSFNNEF 

121 STNQIFFNPS FVMETIAIIN EYQKDISYLK NIINKMNENR AYNHIDSFIT SEYRRKINDY 

181 NLYLDKFEEQ FSQKFKINRT SIKERIIINL NKRRFK 

Number of amino acids: 216 

Average molecular weight (Daltons): 26026.06 

Mean amino acid weight (Daltons): 120.49 

Monoisotopic molecular weight (Daltons): 26009.34 

Mean amino acid monoisotopic weight (Daltons): 120.41 

Amino acid composition 

Acid  Symbol  Number  %       Average %       Acid  Symbol  Number  %       Average %
                              in Swissprot                                  in Swissprot
Ala   A       7       3.24%   7.58%           Cys   C       1       0.46%   1.66%
Asp   D       10      4.63%   5.28%           Glu   E       16      7.41%   6.37%
Phe   F       19      8.80%   4.09%           Gly   G       5       2.31%   6.84%
His   H       2       0.93%   2.24%           Ile   I       28      12.96%  5.81%
Lys   K       22      10.19%  5.95%           Leu   L       12      5.56%   9.42%
Met   M       7       3.24%   2.37%           Asn   N       23      10.65%  4.45%
Pro   P       3       1.39%   4.9%            Gln   Q       10      4.63%   3.97%
Arg   R       11      5.09%   5.16%           Ser   S       13      6.02%   7.12%
Thr   T       7       3.24%   5.67%           Val   V       7       3.24%   6.58%
Trp   W       0       0.00%   1.23%           Tyr   Y       13      6.02%   3.18%

Number of acidic (negative) amino acids (ED):  26    12.04%
Number of basic (positive) amino acids (KR):   33    15.28%
Total charge (KRED):                           59    27.31%
Net charge (KR - ED):                          7     3.24%
Theoretical pI:                                9.52
Total linear charge density:                   0.28
Average hydrophobicity:                        -4.84
Ratio of hydrophilicity to hydrophobicity:     1.37
Percentage of hydrophilic amino acid:          54.17%
Percentage of hydrophobic amino acid:          45.83%
Ratio of %hydrophilic to %hydrophobic:         1.18
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77ORF043 sequence 

29304 atgtattacgaaataggcgaaatcatacgcaaaaatattcatgtt 

1 MYYEIGEIIRKNIHV 

29349 aacggattcgattttaagctattcattttaaaaggtcatatgggc 

16 NGFDFKLFILKGHMG 

29394 atatcaatacaagttaaagatatgaacaacgtaccaattaaacat 

31 ISIQVKDMNNVPIKH 

29439 gcttatgtcgtagatgagaatgacttagatatggcatcagactta 

46 AYVVDENDLDMASDL 

29484 tttaaccaagcaatagatgaatggattgaagagaacacagacgaa 

61 FNQAIDEWIEENTDE 

29529 caggacagactaattaacttagtcatgaaatggtag 29564 

76 QDRLINLVMKW* 



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Physico-chemical parameters of ORF 77ORF043 

1 MYYEIGEIIR KNIHVNGFDF KLFILKGHMG ISIQVKDMNN VPIKHAYVVD ENDLDMASDL 

61 FNQAIDEWIE ENTDEQDRLI NLVMKW 

Number of amino acids: 86 

Average molecular weight (Daltons): 10186.68 

Mean amino acid weight (Daltons): 118.45 

Monoisotopic molecular weight (Daltons): 10180.02 

Mean amino acid monoisotopic weight (Daltons): 118.37 

Amino acid composition 

Acid  Symbol  Number  %       Average %       Acid  Symbol  Number  %       Average %
                              in Swissprot                                  in Swissprot
Ala   A       3       3.49%   7.58%           Cys   C       0       0.00%   1.66%
Asp   D       9       10.47%  5.28%           Glu   E       7       8.14%   6.37%
Phe   F       4       4.65%   4.09%           Gly   G       4       4.65%   6.84%
His   H       3       3.49%   2.24%           Ile   I       11      12.79%  5.81%
Lys   K       6       6.98%   5.95%           Leu   L       6       6.98%   9.42%
Met   M       5       5.81%   2.37%           Asn   N       8       9.30%   4.45%
Pro   P       1       1.16%   4.9%            Gln   Q       3       3.49%   3.97%
Arg   R       2       2.33%   5.16%           Ser   S       2       2.33%   7.12%
Thr   T       1       1.16%   5.67%           Val   V       6       6.98%   6.58%
Trp   W       2       2.33%   1.23%           Tyr   Y       3       3.49%   3.18%

Number of acidic (negative) amino acids (ED):  16    18.60%
Number of basic (positive) amino acids (KR):   8     9.30%
Total charge (KRED):                           24    27.91%
Net charge (KR - ED):                          -8    9.30%
Theoretical pI:                                4.38
Total linear charge density:                   0.30
Average hydrophobicity:                        -2.80
Ratio of hydrophilicity to hydrophobicity:     1.19
Percentage of hydrophilic amino acid:          48.84%
Percentage of hydrophobic amino acid:          51.16%
Ratio of %hydrophilic to %hydrophobic:         0.95
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77ORF102 sequence 

29051 atgagcaacatttataaaagctacctagtagcagtattatgcttc 

1 MSNIYKSYLVAVLCF 

29096 acagtcttagcgattgtacttatgccgtttctatacttcactaca 

16 TVLAIVLMPFLYFTT 

29141 gcatggtcaattgcgggattcgcaagtatcgcaacattcatgtac 

31 AWSIAGFASIATFMY 

29186 tacaaagaatgctttttcaaagaataa 29212 

46 YKECFFKE* 
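
The pairing of each 45-nucleotide line with a 15-residue line follows from reading the DNA three bases at a time. The following is a minimal sketch, assuming the standard genetic code (whose codon assignments are shared with the bacterial code), that reproduces the 77ORF102 translation from the nucleotide lines above. The names `CODON_TABLE`, `translate` and `ORF102_DNA` are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    dna = dna.lower().replace(" ", "")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# 77ORF102 coding sequence, genome positions 29051..29212.
ORF102_DNA = (
    "atgagcaacatttataaaagctacctagtagcagtattatgcttc"
    "acagtcttagcgattgtacttatgccgtttctatacttcactaca"
    "gcatggtcaattgcgggattcgcaagtatcgcaacattcatgtac"
    "tacaaagaatgctttttcaaagaataa"
)

print(translate(ORF102_DNA))
```

The same function can be used to cross-check any of the other ORF listings, since a mismatch between the printed residues and the translated codons points to a transcription error in one of the two lines.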
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Physico-chemical parameters of ORF 77ORF102 

1 MSNIYKSYLV AVLCFTVLAI VLMPFLYFTT AWSIAGFASI ATFMYYKECF FKE 

Number of amino acids: 53 

Average molecular weight (Daltons): 6155.42 

Mean amino acid weight (Daltons): 116.14 

Monoisotopic molecular weight (Daltons): 6151.07 

Mean amino acid monoisotopic weight (Daltons): 116.06 

Amino acid composition 

Acid  Symbol  Number  %       Average %       Acid  Symbol  Number  %       Average %
                              in Swissprot                                  in Swissprot
Ala   A       6       11.32%  7.58%           Cys   C       2       3.77%   1.66%
Asp   D       0       0.00%   5.28%           Glu   E       2       3.77%   6.37%
Phe   F       7       13.21%  4.09%           Gly   G       1       1.89%   6.84%
His   H       0       0.00%   2.24%           Ile   I       4       7.55%   5.81%
Lys   K       3       5.66%   5.95%           Leu   L       5       9.43%   9.42%
Met   M       3       5.66%   2.37%           Asn   N       1       1.89%   4.45%
Pro   P       1       1.89%   4.9%            Gln   Q       0       0.00%   3.97%
Arg   R       0       0.00%   5.16%           Ser   S       4       7.55%   7.12%
Thr   T       4       7.55%   5.67%           Val   V       4       7.55%   6.58%
Trp   W       1       1.89%   1.23%           Tyr   Y       5       9.43%   3.18%

Number of acidic (negative) amino acids (ED):  2     3.77%
Number of basic (positive) amino acids (KR):   3     5.66%
Total charge (KRED):                           5     9.43%
Net charge (KR - ED):                          1     1.89%
Theoretical pI:                                8.18
Total linear charge density:                   0.13
Average hydrophobicity:                        10.81
Ratio of hydrophilicity to hydrophobicity:     0.40
Percentage of hydrophilic amino acid:          28.30%
Percentage of hydrophobic amino acid:          71.70%
Ratio of %hydrophilic to %hydrophobic:         0.39
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77ORF104 sequence 

34393 atggtaaccaaagaatttttaaaaactaaacttgagtgttcagat 

1 MVTKEFLKTKLECSD 

34438 atgtacgctcagaaactcatagatgaggcacagggcgatgaaaat 

16 MYAQKLIDEAQGDEN 

34483 aggttgtacgacctatttatccaaaaacttgcagaacgtcataca 

31 RLYDLFIQKLAERHT 

34528 cgccccgctatcgtcgaatattaa 34551 

46 RPAIVEY* 
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Physico-chemical parameters of ORF 77ORF104 

1 MVTKEFLKTK LECSDMYAQK LIDEAQGDEN RLYDLFIQKL AERHTRPAIV EY 

Number of amino acids: 52 

Average molecular weight (Daltons): 6193.13 

Mean amino acid weight (Daltons): 119.10 

Monoisotopic molecular weight (Daltons): 6189.12 

Mean amino acid monoisotopic weight (Daltons): 119.02 

Amino acid composition 

Acid  Symbol  Number  %       Average %       Acid  Symbol  Number  %       Average %
                              in Swissprot                                  in Swissprot
Ala   A       4       7.69%   7.58%           Cys   C       1       1.92%   1.66%
Asp   D       4       7.69%   5.28%           Glu   E       6       11.54%  6.37%
Phe   F       2       3.85%   4.09%           Gly   G       1       1.92%   6.84%
His   H       1       1.92%   2.24%           Ile   I       3       5.77%   5.81%
Lys   K       5       9.62%   5.95%           Leu   L       6       11.54%  9.42%
Met   M       2       3.85%   2.37%           Asn   N       1       1.92%   4.45%
Pro   P       1       1.92%   4.9%            Gln   Q       3       5.77%   3.97%
Arg   R       3       5.77%   5.16%           Ser   S       1       1.92%   7.12%
Thr   T       3       5.77%   5.67%           Val   V       2       3.85%   6.58%
Trp   W       0       0.00%   1.23%           Tyr   Y       3       5.77%   3.18%

Number of acidic (negative) amino acids (ED):  10    19.23%
Number of basic (positive) amino acids (KR):   8     15.38%
Total charge (KRED):                           18    34.62%
Net charge (KR - ED):                          -2    3.85%
Theoretical pI:                                5.03
Total linear charge density:                   0.38
Average hydrophobicity:                        -5.81
Ratio of hydrophilicity to hydrophobicity:     1.47
Percentage of hydrophilic amino acid:          53.85%
Percentage of hydrophobic amino acid:          46.15%
Ratio of %hydrophilic to %hydrophobic:         1.17
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770RF182 sequence 


29268 atgttcaatacaaaacgaaaaacggaggaagccaagatgtattac
1 MFNIKRKTEEVKMYY
29313 gaaataggcgaaatcatacgcaaaaatattcatgttaacggattc
16 EIGEIIRKNIHVNGF
29358 gattttaagctattcattttaaaaggtcatatgggcatatcaata
31 DFKLFILKGHMGISI
29403 caagttaaagatatgaacaacgtaccaattaaacatgcttatgtc
46 QVKDMNNVPIKHAYV
29448 gtagatgagaatgacttagatatggcatcagacttatttaaccaa
61 VDENDLDMASDLFNQ
29493 gcaatagatgaatggattgaagagaacacagacgaacaggacaga
76 AIDEWIEENTDEQDR
29538 ctaattaacttagtcatgaaatggtag 29564
91 LINLVMKW*
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Physico-chemical parameters of ORF 770RF182 

1 MFNIKRKTEE VKMYYEIGEI IRKNIHVNGF DFKLFILKGH MGISIQVKDM NNVPIKHAYV

61 VDENDLDMAS DLFNQAIDEW IEENTDEQDR LINLVMKW 



Number of amino acids: 98 

Average molecular weight (Daltons): 11691.50

Mean amino acid weight (Daltons): 119.30

Monoisotopic molecular weight (Daltons): 11683.84

Mean amino acid monoisotopic weight (Daltons): 119.22
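The mean weights above appear to be the molecular weights divided by the number of residues. The following is a small sketch of that arithmetic, reading the 77ORF182 values as 11691.50 and 11683.84 Daltons over 98 residues; the function name `mean_residue_weight` is illustrative only.

```python
def mean_residue_weight(molecular_weight: float, n_residues: int) -> float:
    """Molecular weight divided by residue count, rounded to two decimals."""
    return round(molecular_weight / n_residues, 2)


# 77ORF182: average and monoisotopic molecular weights over 98 residues.
print(mean_residue_weight(11691.50, 98))
print(mean_residue_weight(11683.84, 98))
```

The same division applied to the 77ORF104 figures (6193.13 and 6189.12 Daltons over 52 residues) gives the mean weights listed for that ORF.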



Amino acid composition

Acid  Symbol  Number  %       Average % in Swissprot
Ala   A       3       3.06%   7.58%
Cys   C       0       0.00%   1.66%
Asp   D       9       9.18%   5.28%
Glu   E       9       9.18%   6.37%
Phe   F       5       5.10%   4.09%
Gly   G       4       4.08%   6.84%
His   H       3       3.06%   2.24%
Ile   I       12      12.24%  5.81%
Lys   K       9       9.18%   5.95%
Leu   L       6       6.12%   9.42%
Met   M       6       6.12%   2.37%
Asn   N       9       9.18%   4.45%
Pro   P       1       1.02%   4.9%
Gln   Q       3       3.06%   3.97%
Arg   R       3       3.06%   5.16%
Ser   S       2       2.04%   7.12%
Thr   T       2       2.04%   5.67%
Val   V       7       7.14%   6.58%
Trp   W       2       2.04%   1.23%
Tyr   Y       3       3.06%   3.18%



Number of acidic (negative) amino acids (ED): 18 (18.37%)

Number of basic (positive) amino acids (KR): 12 (12.24%)

Total charge (KRED): 30 (30.61%)

Net charge (KR - ED): -6 (6.12%)

Theoretical pI: 4.76

Total linear charge density: 0.33

Average hydrophobicity: -3.89

Ratio of hydrophilicity to hydrophobicity: 1.28
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Percentage of h ydrophilic amino acid: 5 1 .02% 

Percentage of hydrophobic amino acid: 48.98% 

Ratio of %hydrophilic to %hydrophobic: 1.04
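The ratio above follows from the two percentages that precede it. The following is a brief sketch of that check, reading the 77ORF182 percentages as 51.02% and 48.98%; the function name `percent_ratio` is illustrative, and the residue classification behind the percentages is not stated in this section and is not reproduced here.

```python
def percent_ratio(hydrophilic_pct: float, hydrophobic_pct: float) -> float:
    """Ratio of the hydrophilic to the hydrophobic percentage, two decimals."""
    if abs(hydrophilic_pct + hydrophobic_pct - 100.0) > 0.01:
        raise ValueError("percentages do not sum to 100")
    return round(hydrophilic_pct / hydrophobic_pct, 2)


print(percent_ratio(51.02, 48.98))  # 77ORF182
print(percent_ratio(53.85, 46.15))  # 77ORF104
```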
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TableS 



BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100017|lan|77ORF017 Phage 77 ORF|23269-23982|-3
(237 letters)

Database: nr

393,678 sequences; 120,452,765 total letters

Score E

Sequences producing significant alignments: (bits) Value

gi|4493986|emb|CAB39045.1| (AL034559) predicted using hexExon; ... 41 0.010
gi|730607|sp|P23250|RPI1_YEAST NEGATIVE RAS PROTEIN REGULATOR P... 38 0.053
gi|3097044|emb|CAA75299| (Y15035) KIR [Cowpox virus] 38 0.090
gi|2146245|pir||S73794 hypothetical protein H91_orf180 - Mycopl... 38 0.090
gi|83910|pir||S04682 ribosomal protein var1 - yeast (Candida gl... 37 0.15
gi|133135|sp|P21358|RMAR_CANGA MITOCHONDRIAL RIBOSOMAL PROTEIN ... 37 0.15
gi|2128843|pir||H64475 hypothetical protein MJ1409 - Methanococ... 36 0.20
gi|5107017|gb|AAD39926.1|AF126285_2 (AF126285) RNA polymerase [... 36 0.35
gi|2146210|pir||S73342 hypothetical protein E07_orf166 - Mycopl... 35 0.60

Database: swissprot

79,449 sequences; 28,874,452 total letters

Score E

Sequences producing significant alignments: (bits) Value

sp|P23250 RPI1_YEAST NEGATIVE RAS PROTEIN REGULATOR PROTEIN. 38 0.014
sp|P21358 RMAR_CANGA MITOCHONDRIAL RIBOSOMAL PROTEIN VAR1. 37 0.040
sp|Q21444 LDLC_CAEEL LDLC PROTEIN HOMOLOG. 34 0.35
sp|P27240 RFAY_ECOLI LIPOPOLYSACCHARIDE CORE BIOSYNTHESIS PROT. 33 0.46
sp|P53192 YGC0_YEAST HYPOTHETICAL 27.1 KD PROTEIN IN ALK1-CKB1. 33 0.60
sp|P32908 SMC1_YEAST CHROMOSOME SEGREGATION PROTEIN SMC1 (DA-B. 33 0.60
sp|P54683 TAGB_DICDI PRESTALK-SPECIFIC PROTEIN TAGB PRECURSOR. 32 0.78
sp|Q03100 CYAA_DICDI ADENYLATE CYCLASE, AGGREGATION SPECIFIC (. 32 0.78
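The query length reported in each BLAST header can be cross-checked against the ORF coordinates on the Query line. The following is a minimal sketch, assuming the coordinates are inclusive and include the stop codon (an inference from the numbers in this section, not a statement in the specification); the function name `orf_letters` is illustrative.

```python
def orf_letters(start: int, end: int) -> int:
    """Residues encoded by an ORF given inclusive coordinates that cover the stop codon."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError("coordinate span is not a whole number of codons")
    return span // 3 - 1  # drop the stop codon


print(orf_letters(23269, 23982))  # 77ORF017
print(orf_letters(34393, 34551))  # 77ORF104
print(orf_letters(29268, 29564))  # 77ORF182
```

Under that assumption the three spans give 237, 52 and 98 letters, matching the lengths reported for 77ORF017, 77ORF104 and 77ORF182.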
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BLAST? 2.0.8 (Jan-05- 1999} 



Query= sid|100019|lan|77ORF019 Phage 77 ORF|39851-40501|2
(216 letters)

Database: nr

373,355 sequences; 114,214,446 total letters

Score E

Sequences producing significant alignments: (bits) Value

gi|3341966|dbj|BAA31932| (AB009866) orf 59 [bacteriophage phi PVL] 437 e-122
gi|2689911 (AE000792) B. burgdorferi predicted coding region BB... 38 0.058
gi|1171589|emb|CAA64574| (X95275) frameshift [Plasmodium falcip... 37 0.10
gi|4493986|emb|CAB39045.1| (AL034559) predicted using hexExon; ... 36 0.23
gi|141257|sp|P18019|YPI9_CLOPE HYPOTHETICAL 14.5 KD PROTEIN (OR... 36 0.29
gi|133412|sp|P27059|RPOB_ASTLO DNA-DIRECTED RNA POLYMERASE BETA... 35 0.51
gi|3122231|sp|Q58851|HISX_METJA HISTIDINOL DEHYDROGENASE (HDH) ... 35 0.51
gi|3649757|emb|CAB11106.1| (Z98547) predicted using hexExon; MA... 34 0.66
gi|2688313 (AE001146) sensory transduction histidine kinase, pu... 34 0.87

Database: swissprot

79,449 sequences; 28,874,452 total letters

Score E

Sequences producing significant alignments: (bits) Value

sp|P18019 YPI9_CLOPE HYPOTHETICAL 14.5 KD PROTEIN (ORF9). 36 0.079
sp|Q58851 HISX_METJA HISTIDINOL DEHYDROGENASE (EC 1.1.1.23) (H. 35 0.14
sp|P27059 RPOB_ASTLO DNA-DIRECTED RNA POLYMERASE BETA CHAIN (E. 35 0.14
sp|Q02224 CENE_HUMAN CENTROMERIC PROTEIN E (CENP-E PROTEIN). 34 0.31
sp|P04931 ARP_PLAFA ASPARAGINE-RICH PROTEIN (AG319) (ARP) (ERA. 33 0.53
sp|P18011 IPAB_SHIFL 62 KD MEMBRANE ANTIGEN. 32 0.69
sp|P18709 VTA2_XENLA VITELLOGENIN A2 PRECURSOR (VTG A2) [CONTA. 32 0.90
sp|Q64409 CP3H_CAVPO CYTOCHROME P450 3A17 (EC 1.14.14.1) (CYPI. 32 0.90
sp|P21358 RMAR_CANGA MITOCHONDRIAL RIBOSOMAL PROTEIN VAR1. 32 0.90
sp|Q03945 IPAB_SHIDY 62 KD MEMBRANE ANTIGEN. 32 1.2



WO 00/32825 



PCT/IB99/02040 






BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100043|lan|77ORF043 Phage 77 ORF|29304-29564|3
(86 letters)

Database: nr

373,355 sequences; 114,214,446 total letters

Score E

Sequences producing significant alignments: (bits) Value

gi|3341947|dbj|BAA31913| (AB009866) orf 39 [bacteriophage phi PVL] 182 6e-46
gi|744518|prf||2014422A FKBP-rapamycin-associated protein [Homo... 32 0.84
gi|1169736|sp|P42346|FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN... 32 0.84
gi|1169735|sp|P42345|FRAP_HUMAN FKBP-RAPAMYCIN ASSOCIATED PROTE... 32 0.84
gi|3282239 (U88966) rapamycin associated protein FRAP2 [Homo sa... 32 0.84
gi|3875402|emb|CAA98122| (Z73906) cDNA EST EMBL:D64544 comes fr... 31 2.5
gi|1084792|pir||S54091 hypothetical protein YPR070w - yeast (Sa... 30 4.2

Database: swissprot

79,449 sequences; 28,874,452 total letters

Score E

Sequences producing significant alignments: (bits) Value

sp|P42345 FRAP_HUMAN FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP). 32 0.24
sp|P42346 FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP) (R. 32 0.24
sp|P34554 YNP1_CAEEL HYPOTHETICAL 42.2 KD PROTEIN T05G5.1 IN C. 28 3.5
sp|Q24118 LIO_DROME LINOTTE PROTEIN. 28 3.5
sp|P80034 ACH2_BOMMO ANTICHYMOTRYPSIN II (ACHY-II). 28 3.5
sp|P22922 A1AT_BOMMO ANTITRYPSIN PRECURSOR (AT). 28 3.5
sp|Q44363 TRAA_AGRT6 CONJUGAL TRANSFER PROTEIN TRAA. 28 3.5
sp|P38255 YBU5_YEAST HYPOTHETICAL 51.3 KD PROTEIN IN PHO5-VPS1. 27 6.0
sp|P55822 SH3B_HUMAN SH3BGR PROTEIN (21-GLUTAMIC ACID-RICH PRO. 27 7.9
sp|Q58482 YA82_METJA HYPOTHETICAL PROTEIN MJ1082. 27 7.9
sp|P34252 YKK8_YEAST HYPOTHETICAL 52.3 KD PROTEIN IN HAP4-AAT1. 27 7.9
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BLASTP 2.0.8 [Jan-05-1999] 

Query= sid|100102|lan|77ORF102 Phage 77 ORF|29051-29212|2
(53 letters)

Database: nr

373,355 sequences; 114,214,446 total letters

Score E

Sequences producing significant alignments: (bits) Value

gi|3341946|dbj|BAA31912| (AB009866) orf 38 [bacteriophage phi PVL] 96 3e-20
gi|4325288|gb|AAD17315| (AF123593) voltage-dependent sodium cha... 28 7.1
gi|2649684 (AE001040) A. fulgidus predicted coding region AF092... 28 9.3

Database: swissprot

79,449 sequences; 28,874,452 total letters

Score E

Sequences producing significant alignments: (bits) Value

sp|P42087 HUTM_BACSU PUTATIVE HISTIDINE PERMEASE. 26 7.1
sp|P04775 CIN2_RAT SODIUM CHANNEL PROTEIN, BRAIN II ALPHA SUBU... 26 9.2
sp|P42619 YQJF_ECOLI HYPOTHETICAL 17.2 KD PROTEIN IN EXUR-TDCC... 26 9.2
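Each line of the hit lists above ends with a bit score and an E value, so the two figures can be split off from the right. The following is a minimal sketch of such a parser; the function name `parse_hit` is illustrative, and it assumes the description, score and E value are separated by whitespace as in these reports. BLAST prints very small E values without a leading digit (for example `e-122`), which the sketch reads as `1e-122`.

```python
def parse_hit(line: str):
    """Split a one-line BLAST hit summary into (description, bit score, E value)."""
    description, score, evalue = line.rstrip().rsplit(None, 2)
    if evalue.startswith("e"):
        # BLAST abbreviates 1e-122 as e-122.
        evalue = "1" + evalue
    return description, float(score), float(evalue)


hit = parse_hit("sp|P42087 HUTM_BACSU PUTATIVE HISTIDINE PERMEASE. 26 7.1")
print(hit)
```

Filtering the parsed hits by E value (for example, keeping only those far below 1) separates the phage phi PVL matches in these reports from the weak matches that follow them.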
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BLAST? 2.0.8 I Jan- OS- 1999 ] 

Query= sid|100104|lan|77ORF104 Phage 77 ORF|34393-34551|1
(52 letters)

Database: nr

373,355 sequences; 114,214,446 total letters

Score E

Sequences producing significant alignments: (bits) Value

gi|2315523 (AF016452) similar to the leucine-rich domains found... 29 4.2
gi|4377168|gb|AAD18990| (AE001666) CT711 hypothetical protein [... 29 5.4
gi|3882171|dbj|BAA34445| (AB018268) KIAA0725 protein [Homo sapi... 28 9.3

Database: swissprot

79,449 sequences; 28,874,452 total letters

Score E

Sequences producing significant alignments: (bits) Value

sp|P04879 RRPP_VSVIG RNA POLYMERASE ALPHA SUBUNIT (EC 2.7.7.48. 27 5.4
sp|P04880 RRPP_VSVIM RNA POLYMERASE ALPHA SUBUNIT (EC 2.7.7.48. 27 5.4
sp|Q13946 CN7A_HUMAN HIGH-AFFINITY CAMP-SPECIFIC 3',5'-CYCLIC. 26 7.1
sp|P35381 ATPA_DROME ATP SYNTHASE ALPHA CHAIN, MITOCHONDRIAL P. 26 9.3
sp|P54659 MVPB_DICDI MAJOR VAULT PROTEIN BETA (MVP-BETA). 26 9.3
sp|P40397 YHXC_BACSU HYPOTHETICAL OXIDOREDUCTASE IN APRE-COMK. 26 9.3
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BLASTP 2.0.8 [Jan-0S-1999] 



Query= sid|122748|lan|77ORF182 Phage 77 ORF|29268-29564|3
(98 letters)

Database: nr

393,678 sequences; 120,452,765 total letters

Score E

Sequences producing significant alignments: (bits) Value

gi|3341947|dbj|BAA31913.1| (AB009866) orf 39 [bacteriophage phi... 182 8e-46
gi|1084792|pir||S54091 hypothetical protein YPR070w - yeast (Sa... 35 0.
gi|1169736|sp|P42346|FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN... 32 1.
gi|744518|prf||2014422A FKBP-rapamycin-associated protein [Homo... 32 1.
gi|5051331|emb|CAB44736.1| (AL049653) dJ647M16.2 (FK506 binding... 32 1.
gi|4826730|ref|NP_004949.1|pFRAP1| FK506 binding protein 12-rap... 32 1.
gi|3282239 (U88966) rapamycin associated protein FRAP2 [Homo sa... 32 1.

Database: swissprot

79,909 sequences; 29,054,478 total letters

Score E

Sequences producing significant alignments: (bits) Value

sp|P42345 FRAP_HUMAN FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP). 32 0.29
sp|P42346 FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP) (R. 32 0.29
sp|P40557 YIA5_YEAST PUTATIVE DISULFIDE ISOMERASE YIL005W PREC. 29
sp|Q24118 LIO_DROME LINOTTE PROTEIN. 28
sp|Q44363 TRAA_AGRT6 CONJUGAL TRANSFER PROTEIN TRAA. 28
sp|P80034 ACH2_BOMMO ANTICHYMOTRYPSIN II (ACHY-II). 28
sp|P34554 YNP1_CAEEL HYPOTHETICAL 42.2 KD PROTEIN T05G5.1 IN C. 28
sp|P22922 A1AT_BOMMO ANTITRYPSIN PRECURSOR (AT). 28
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Table 7 

Bacteriophage 3A, complete genome sequence 
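The genome listing that follows gives, on each line, the 1-based position of the first nucleotide and then seven blocks of ten nucleotides, so consecutive position labels differ by 70. The following is a minimal sketch for reading such a line and flagging characters that cannot be nucleotide codes; the function names are illustrative, and treating anything outside the IUPAC nucleotide alphabet as suspect is an assumption of this sketch, not a rule stated in the specification.

```python
# Lower-case IUPAC nucleotide codes (bases plus ambiguity symbols).
IUPAC = set("acgtrykmswbdhvn")


def parse_listing_line(line: str):
    """Return (start position, sequence without spaces) for one listing line."""
    fields = line.split()
    return int(fields[0]), "".join(fields[1:])


def expected_label(line_index: int, width: int = 70) -> int:
    """Position label expected on the line with the given 0-based index."""
    return 1 + width * line_index


def unexpected_symbols(sequence: str):
    """Characters in the sequence that are not IUPAC nucleotide codes."""
    return sorted(set(sequence) - IUPAC)


first = ("1 caaacgctag caacgcggat aaatttttca tgaaaggggg "
         "tctttatatg aagttaacaa aaaaacagct")
position, sequence = parse_listing_line(first)
print(position, len(sequence), unexpected_symbols(sequence))
```

A label that differs from `expected_label` for its line, or a block that contains a flagged character, marks a place where the listing should be checked against the original sequence record.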

1 caaacgctag caacgcggat aaatttttca tgaaaggggg tctttatatg aagttaacaa aaaaacagct 

71 aaaagaatat atagaagatt acaaaaaatc tgatgacata ttaattaatt tgtatataga aacatatgaa 

141 ttttattgtc ggttaagaga tgaacttaaa aatagtgatt taatgataga gcatacaaac aaggctggcg 

211 cgagcaatat tattaagaat ccattaagca tagaactgac aaaaacagtt caa&cactaa ataactcact 

281 caagtctatg ggtttaactg cagcacaaag aaaaaagata gctcaagaag aaggtggatt cggtgaetat 

351 taaagtttta aatgaacctt caccaaaact attaacaaca tggtatgcag agcaagtcac tcaagggaaa

421 ataaaaacaa gcaaatatgt tagaaaagaa cgtgagagac atcttagata tccagaaaac ggaggcaaat 

491 gggtatttga tgaagaatta gcgcatcgtc ctattcgact tacagaaaag ttttgtaaac cttccaaagg

561 atctaaacgt caacttgtat cacagccatg gcaacatttt attatcggca gtttgtctgg ttgggttcat 

631 aaagaaacaa aactgegcag gtttaaagaa gctttgatat ttacggggcg aaaaaacggt aaaacaacca 

701 ctatttctgg ggecgctaac tatgctgtat cacaagatgg agaaaatggt gcagaaattc atttgttagc 

771 aaacgtaacg aaacaagcta ggattceatt tgacgaatct aaggcgatga ttaaagctag cccaaagctt 

841 gataaaaatt tcagaacatt aagagacgaa atccattatg acgcaacgat atcaaaaatt acgccccaag 

911 catcagatag egataagtta gatggattga atacacacat ggggattttc gatgaaattc acgaatttaa 

981 agactataaa ttgatetcag ctataaaaaa ctcaagagct gcaaggttac aacctcttct catccacatt 

1051 acgacagcag ggtatcaatc agatggccca cttgttgata tggtagaagc gggaagagac accctagatc 

1121 aaatcataga agacgaaaga acttettatt atttagcatc tttggatgat gacgatgata ttaatgactc 

1191 gccgaactgg ataaaagcaa atcccaacct aggtgtctct ataaatttag atgagatgaa agaagagtgg 

1251 gaaaaagcca agagaacacc agctgaacgt ggagatttta taaccaaaag gtctaatatc tttgctaata

1331 atgacgagac gagttttatt gattacceaa cactccaaaa aaataatgaa attgtttctt tagaagagct 

1401 ggaaggcaga ccgtgcacga ttggttatga tttatcagaa acagaggact ttacagccgc gtgtgctact

1471 tttgcgttag ataatggtaa agttgcagct ttatcgcatt cacggaetcc taagcacaaa gttgaatatt 

1541 ctaacgaaaa aataccctat agagaatggg aagaagatgg cttattaaea gcgcaagata agccttatat 

1611 tgactaccaa gatgttctaa atttggataat taagatgaat gagcattatg tagtagaaaa aattacttat 

1681 gatagagcga acgcattcaa actaaatcaa gagttaaaaa attacgggtt tgaaacggaa gaaacaagac 

1751 aaggagcttt gaccttgagc cctgcattga aggatttaaa agaaatgttt ttagatggga aaataatatt 

1821 taataataat ccctcaatga aatggtatat caataatgtt cagttgaaac tagacagaaa cggaaactgg 

1891 ttgccgtcta agcaaagcag atatcgtaaa atagatggct ttgcagcatt tttaaacaca tatacagata 

1961 ttatgaataa agttgtttct gatagtggtg aaggaaacat agagtttatt agtattaaag acataatgcg 

2031 ttaaggaggt gaatgttatc gcaaaagaga atattgtcac acgcataaag aaaaaattga tagacaattg 

2101 gattgatcag ccaacttcta agcttcatga ccttagccca tggaaaaata gatctttttg gggtgtaatt 

2171 aataatacgc ttgaaactaa tgaaacgata ttttcagcta ttacaaagtt atctaattcg atggctagtt 

2241 tgcccttgaa aacgtatgaa gactataaag tagttaatac agaagtatct gatttactta cagtgtcacc 

2311 gaataattct ctgagcagtt ttgattttat taatcaaatt gaaacaatca gaaatgaaaa aggtaaegca 

2381 tatgtgctaa ttgaacgaga catctatcac caaccatcaa agcttttctt attaaatcca gacgttgttg 

2451 aaatgttaat tgaaaaccaa tcacgtgaac ttcactattc cattcatgct gcaactggaa ataaattgat 

2521 cgctcataat atggacatgt tgcattttaa acacatcgtg gcatccaata tggtgcaagg cattagtccg 

2591 atcgatgtgt tgaagaatac aactgatttt gacaatgcag taagaacctt taatcteaca gaaatgcaaa

2661 aacctgattc ttccatgctt aaatatggte ccaatgtagg taaagaaaaa aggcagcaag tgttagaaga 

2731 ctccaaacag tactatgaag aaaacggtgg aatactattc caagagcctg gcgttgaaat cgaaccgtta 

2801 cctaaaaaat atgtctctga agatatagtg gcaagcgaga atttaacaag agaaagagta gctaacgttt 

2871 ttcaattgcc ctcagtactc ctaaatgcaa gatcaaatac aaatttcgcg aaaaatgaag agttaaacag 

2941 attttacttg cagcataccc tactgccaac cgtcaaacag tatgaagaag aatttaatcg gaaactactt 

3011 actaaaacag acagagaaaa aaacaggtat tttaaactta acgttaaacc ttatctaagg gctgacagtg 

3081 caacacaagc agaagtgtac ettaaagcag ttcgtagcgg ttactacacc acaaatgaca ttagagagtg 

3151 ggaagattta ccaccagctg aaggtggaga taagccgcta ataagcggtg atttataccc aattgaeacg 

3221 ccacttgaat taagaaaatc tttgaaaggt ggtgataaaa atgtcaatga aagctaagta ttttcaaatg 

3291 aaaagaaaat caaaaagtaa aggtgaaaca tttatttatg gtgatattgt aagtgataaa tggtttgaaa 

3361 gtgatgtaac tgctacagat ttcaaaaata aactagatga acxaggagac atcagtgaaa tagatgttca 

3431 tataaattca tctggaggca gtgtatttga agggcatgca atatacaata tgctaaaaat gcatcctgca 

3501 aaaattaata tctatgtcga tgccttagcg gcatcaattg ctagtgctat cgccatgagt ggtgacacta

3571 tttttatgca caaaaatagt tttttaatga cccataattc atgggttatg actgtaggta atgcagaaga

3641 gctaagaaag acagcggatt tactegaaaa aacagatgcc gttagtaatt cagctcactt agataaagca 

3711 aaagattcag atcaagaaca cttaaaacag atgttagatg cagaaacttg gcttactgca gaagaagcct

3781 tgtctttcgg cttgatagat gaaattttag gagctaacga aataactgct agtatctcta aagagcaata 

3851 taagcgtttc gagaacgtcc cagaagattt aaagaaagat gtagacaaaa tcactaaaat cgacgatgta 

3921 gatacgtttg aactggctga aacacccaaa gaaagtatgt cactagaaga aaaagaaaaa agagaaaaaa 

3991 ttaaacgcga atgcgaaact ttaaaaatga caatgagtta ctaggaggaa atgaaatgcc gacattatat 

4061 gaattaaaac aatcettagg tatgattgga caacaactaa aaaataaaaa tgatgaaetg agtcagaaag 

4131 caacagaccc aaatattgat attggaagaca tcaaacaact agaaacagaa aaagcaggct tacaacaaag 

4201 atttaacatt gttgaaagac aagtaaaaga cattgaagaa aaagaaaaag cgaaagttaa agacacagga 

4271 gaagcttatc aatctttaaa tgatcatgag aagatggtta aagctaaggc agagttttat cgtcacgcga

4341 ttttaccaaa tgaatttgaa aaaccttcaa cggaggcaca acgcttacta cacgctttac caacaggtaa 

4411 tgattcaggt ggtgacaagc tcttaecaaa aacactctct aaagaaactg tttcagaacc atttgctaaa 

4481 aaccaattac gtgaaaaagc tcgtctaacc aacactaaag gcttagagac tccaagagct ccacatactt 

4551 tagacgatga tgacttcatc acagatgtag aaacagcaaa agaattaaaa ctaaaaggtg atacagttaa

4621 actcactact aataaattca aagtatctgc tgcaatttca gatactgtaa ttcatggatc agatgtagat

4691 ttagtaaact gggttgaaaa cgcactacaa ccaggtctag cagctaaaga acgtaaagat gccttagcag
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4761 caagtectaa atctggatca gaecacatgt cattttacaa tggacctgtt aaagaagctg agggagcaga 

4831 catgtacgat gccactatca acgccttagc agatttacat gaagatcacc gtgataacgc aacaatttat

4901 atgcgacatg cggattatgt caaaattatt agtgctcttc caaatggaac aacaaatttc cttgacacac 

4971 cagcagaaaa agcatttggc aaaccagtag tatttacaga tgcagcagtt aaacctatcg tgggagattt

5041 caactatttt ggaattaact atgatggaac aacttatgac actgataaag atgttaaaaa aggcgaatat 

5111 ctgtttgtac taactgcacg gtatgatcag caacgtacat cagacagtgc attcagaate gcaaaagcaa 

5181 aagaaaatac aggcccatta cccagccaag ccccaaaagg ctaatgtaac agctaaggct aaatcagctg

5251 caacatcagc cgaatagggg tgatgaaacg agtttagaag aaattaaatt gtggttgaga attgactata 

5321 atctcgaaaa tgacttaatt gaaggtctca ttcaatcggc eaagtctgaa ttaccattaa gtggggttcc 

5391 agattatgac aaagatgact tggaataccc gcttttttgt acagcgatca gatatatcat tgcaagagac 

5461 catgaaagtc gtgggcactc aaatgaccaa tctagaagca aggtttttaa cgaaaaggga tcgcaaaaaa 

5531 cgatcctgaa atcaaaaaag cggtaggtga tttttaaatg gaatttaatg aatttaaaga tcgcgcatac

5601 ctttttcaac atgtaaacaa agggccgtat ccagatgaag aggaaaaaat gaagttgtat agttgctttt 

5671 gtaaaataca taatccttct atgaaagaca gagaaatttt aaaagcgacc gaatcaaagt caggaccaac 

5741 cataattacg aggtcttcta aaattgaata tctaccacaa acaaatcact tagttaaaat tgacagaggc 

5811 ttatattccg ataaactatt caacattaaa gaaataagaa ttgatacacc agatattggc tataatacag

5881 tggttttatc agaaaaatga gtgtagaaat taaagggata cctgaagtgt tgaagaaatt agaatcggta 

5951 tacggtaaac aatcaacgca agctaagagt gatagagctt taaatgaagc atctgaattt tctacaaagg 

6021 ctttaaagaa agaattcgag agttttaaag atacgggtgc tagcacagaa gaaatgacta aatctaagcc 

6091 ttatacaaaa gcaggaagtc aagaaagagc tgttttaatt gaatgggtag gccctatgaa ccgcaaaaac 

6161 attactcact tgaatgaaca tggttataca agagatggaa aaaaatatac accaagaggt tttggagtta 

6231 ttgcaaaaac attagccgct aatgaacgga agtatagaga aattataaaa aaggagttgg ccagataaat 

6301 gaatatacta aacaccataa aagaaatttt attatctgat gcagagctcc aaacatatat aaattctaga 

6371 atatactatt ataaagtcac tgaaaatgct gaaacttcca aaccttttgt tgttattaca cctatttatg 

6441 atttaccttc agacttcatg tccgacaaac atcttagtga agaatactta attcaaacag atgtagaatc 

6511 ttcaaataat cagaaaacaa ttgatataac aaaacgaata agatatctgt tatatcaaca aaatttaatt

6581 caagcatcta gtcagttaga cgcttatttt gaagaaacta aacgttatgt gatgtcgaga cgttatcaag 

6651 gcataccaaa aaatatatat tataaaaatc agcgcatcga ataggtgtgc ettttaattt ttaaggagga 

6721 aacaagcaat ggcagaagga caaggttctt ataaagtagg ttttaaaaga ttatacgttg gagtttttaa 

6791 cccagaagca acaaaagtag ttaaacgcat gacatgggaa gatgaaaaag gtggtacagt tgatctaaac 

6861 atcacaggtt tagcaccaga tttagtagat atgtttgcat ctaacaaacg tgtttggatg aaaaaacaag 

6931 gtactaatga agttaagtct gacatgagra tttttaacat tccaagtgaa gatceaaata cagttattgg 

7001 tcgttctaaa gataaaaacg gtacatcttg ggcaggagag aatacaagag caccatacgt aacagttatt 

7071 ggagaatctg aagatggttt aacaggtcaa ccagtgtacg ttgcgctact taaaggtact tttagcttgg 

7141 attcaattga atttaaaaca cgaggagaaa aagcagaagc accagagcca acaaaactaa ctggtgaccg 

7211 gatgaacaga aaagttgatg ttgatggtac tccacaaggr atcgtatacg ggtatcacga aggtaaagaa 

7281 ggagaagcag aattcttcaa aaaagtattc gctggataca cggacagtga agatcattca gaggattctg 

7351 caagttcgtt acccagccaa cccccaaaat gttgaagtag cagttaatte aaaatctgca acagcttcag 

7421 cagaataggg gcttccaaaa taaatcaaag gagaataatt tatgaccaaa actttaaagg tttataaagg 

7491 agacgacgtc gtagcttctg aacaaggtga aggcaaagtg tcagtaactt tatctaattt agaagcggat 

7561 acaacttatc caaaaggtac ttaccaagtg gcatgggaag aaaatggtaa agaatctagt aaagttgatg 

7631 tacctcaatt caaaaccaat ccaattctag tctcaggcgt atcatttaca cccgaaacta aatcaatcac 

7701 ggtaaatgct gatgacaatg ttgaaccaaa cattgcacca agtacagcaa cgaataaaac gctgaaatat 

7771 acaagtgaac atccagagtt tgttactgtt gatgagagaa caggagcaat tcacggtgta gctgagggaa 

7841 cttcagttat cactgctacg tctactgacg gaagtgacaa gtctggacaa atttacagtaa cagtaacaaa 

7911 tggataatta tttgagacgc agaatatctg cgtctttttt atttgaataa aaggagctaa tacaatgatt 

7981 aaatttgaaa ttaaagaccg taaaacagga aaaacagaga gctatacaaa agaagatgtg acaatgggcg 

8051 aageagaaaa atgctatgag tatttagaac tagtaaatca agagaataaa aaagaagtac ctaacgcaac

8121 aaaaatgaga caaaaagagc gacagttatt agtagattta tttaaagatg aaggattgac cgaagaagat 

8191 gttttgaaca agatgagcac taaaacttat acaaaagcct tgaaagatat atttcgagaa atcaatggtg 

8261 aagatgaaga agattcagaa actgaaccag aagagatggg aaagacagaa gaacaatctc aataaaagat 

8331 attttatcga acattaagaa aatacaacgt ttctgtatgg agcagtatgg gtggaeatta actgaagtca 

8401 gaaaacagcc gtatgtaaaa cttttagaaa cacttaatga agagaataaa gaagagactg aagaaaaaca 

8471 aagtgaacaa aaagtcatta caggtaegga tttaagaaaa ctttttggaa gctagaaagg aggttaatat 

8541 gaatgaaaaa gtagaaggea cgaccttgga gctgaaatta gaccatttag gtgtccaaga aggcatgaag 

8611 ggtttaaagc gacaattagg tgttgttaat agtgaaatga aagctaatct gtcatcattt gacaagtctg 

8681 aaaaatcaat ggaaaagtat caggcgagaa ttaaggggtt aaatgataag cttaaagttc aaaaaaagat 

8751 gtattctcaa gtagaagatg agcttaaaca agttaacgct aattatcaaa aagctaaatc tagtgtaaaa 

8821 gatgttgaga aagcatattt aaagctagta gaagctaaca aaaaagaaaa attagctctt gataaatcta 

8891 aagaagcctt aaaatcttcg aatacagaac ttaaaaaagc cgaaaatcaa tataaacgta caaatcaacg 

8961 taaacaagat gcatatcaaa aacttaaaca gttgagagat gcagaacaaa agcttaagaa tagtaaccaa 

9031 gctactactg cacaactaaa aagagcaagt gacgcagtac agaagcagtc cgctaagcat aaagcacttg 

9101 ttgaacaata caaacaagaa ggcaatcaag ttcaaaaact aaaagtacaa aatgataatc tttcaaaatc 

9171 aaaegaaaaa atagaaaatt cttacgctaa aactaatact aaattaaagc aaacagaaaa agaatttaat 

9241 gatttaaata atactattaa gaatcatagc gctaatgtcg caaaagctga aacagctgtt aacaaagaaa 

9311 aagctgcttt aaataattta gagcgttcaa tagataaagc ttcacccgaa atgaagactt ttaacaaaga 

9381 acaaatgata gctcaaagtc atttcggcaa acttgctagt caagcggatg tcatgtcaaa gaaatttagt 

9451 tctattggag ataaaatgac ttccctagga cgtacgatga cgatgggcgt atctacaccg attactttag 

9521 ggctaggtgc agcattaaaa acaagtgcag acttcgaagg gcaaatgtct cgagttggag cgattgcaea 

9591 agcaagcagt aaagacttaa aaagcatgtc taatcaagcg gttgacttag gcgctaaaac aagtaaaagt

9661 gctaacgaag ttgctaaagg catggaagaa ttggcagctt taggctttaa tgccaaacaa acaatggagg 

9731 ctatgccggg tgttatcagt gcagcagaag caagcggtgc agaaatggct acaaccgcaa ctgtaatggc 

9801 atcagcaatt aattctttcg gtttaaaagc atctgatgca aaccatgttg ctgatttact tgcgagatca 

9871 gctaatgata gtgctgcaga tattcaatac atgggagatg cattaaaata tgcaggtact ccagcaaaag

9941 cattaggagt ctcaatagag gacacttctg cagcaactga agttttatct aactcagggt tagaggggtc 

10011 tcaagcaggt actgcattaa gagcttcgtt tattaggcta gccaatccaa gcaaaagcac agctaaggaa 
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10081 acgaaaaaat eaggtatcca tttgtctgac 

10151 agttccaaga caacatgaaa ggcatgacga 

10221 tgaagcagca ageggatttt tagccttgat 

10291 ttgaagaacc ctaatggtga aagtaaaaaa 

10361 aacaattagg tggcgctttt gaatcgttag

10431 aggtgcggaa ggattaacaa aattagttga
10501 gcaggtccag cgacttttgg tgcatctatt

10571 tcggaagcgc ggctaaaggc tatgcatcat
10641 caatccaaaa gcaatgaaat ctttaggtct 
10711 aaaggcctta aaggattagc cggagctatg 
10781 caaagctagc aattttaccg ttcaaacttt 
10851 agtaagtgga ggcgcaagat ttgctggtgc 
10921 actgctatta caattgcata taaagttttc 
10991 ttaacggttt aggagaaact ataaagtttt 
11061 agagtttaaa aattatcttg gaagtatagg 
11131 ggttataaat ctttgagtga cgatgacctt 
11201 ccatgggcac agcttctaaa aaagcacctg 
11271 agaaaaagct ttagaaaaat acgtacacta 
11341 aactcgggtc aaacaacaga agacaaagca 
11411 ttacagctga aatagaaaaa agaaacaaaa 
11481 tgcgttcgat gaacaagaaa agcaaaacat 
11551 aaagagcaag aactcaatca gaaaatcaaa 
11621 aaaatgaaag aaaagaaatt gaaaagcttg 
11691 gactgaaaaa gagcaagagc gtattttagt 
11761 gcgagcaaag caattaaaga agcagaaaaa
11831 aagatgatgt cattgctata aaaaataacg 
11901 tgctgatcaa agacataagg atgaagtaag 
11971 aaaaagcaaa ataaagatat tgataaagag 
12041 agtggtggaa tggccttaaa agctggtggt 
12111 cgctaaagaa caagaagaaa cagctcgtag 
12181 gacggcgtaa aaactaa&ac cggcgaagcc 
12251 aaacgaaaaa aatgtggagt ggaaccaaag 
12321 aagttctgta ggatatcaca ctaaggctat 
12391 cctgttaaat cgactacagg aagtatttac 
12461 cttgggcgca ttcaaaatct atttggaaag 
12531 gggctggcca acggacatgg ccaataaatc 
12601 aatgcaaaat ccgtttggaa aggaacatcg 
12671 ctggagatat gtattcaaga gcccacgacc 
12741 atcagtattt aatggtttta gaaaatggct 
12811 atgggaagag ctgcggctga tttaggtaaa 
12881 ctggcggtat taataaaata tctaaagcca
12951 caetggtact ttagcaggaa agggtgtagc 
13021 gtattaaatg atagaggttc tggaaacgcc 
13091 gaacacccca tgcaccccaa ggacgagatg 
13161 caatgacact ctgaagttac agcggatggg 
13231 tggceagacc aacttaaagg taatataggt 
13301 cgcataatat caaaaaaggt gcagaagaaa 
13371 ttggttaggc gataaaaccg gcgatgtgtg 
13441 atgtcaggtt caaatattaa ttttggaggc 
13511 actgctcaaa aagaaattaa cagacaaagt 
13581 agetatctat ttgaatatcc aatccggcaa 
13651 gtcgtcacta tggtatagac ettggtatgc 
13721 agacaaggca tggactgact acggtggcgg 
13791 tggtatatgc atttatctaa gcaattagca 
13861 aatcaggtgc tacaggtaat ttcgttagag 
13931 agggaatgat acagctaaag atccagaaaa 
14001 tcaggtgtta ataaggctgc atctgcttgg 
14071 atgttactcc gggtgatgta ggaaatatca 
14141 aactcaatct agttcgctta gagacatcaa 
14211 atcccacaaa catttagaca ttatgctgtt 
14281 cagcgttcct taacaacaga tattggcgct 
14351 aagaagatat gcgaatggtg gtttgactac 
14421 gagatggtta tccctctaac cagaegtaaa 
14491 gtatggatgg caagccaaac aacatcactg 
14561 aattgtcatg ttaagtgata aaggaaacaa 
14631 aataacttag gttctaatga tgcaattaga 
14701 caaatgcaaa taactatatg ggaggtttga 
14771 aaggaagaag taataacaga ttttaatcag 
14841 atgataacag tgtaactatt aacggagtag 
14911 attagtatta aggttcggct atgacggtat 
14981 tctgtgttta atcgcagaca tccctactat 
15051 atacagccaa cgttacaccc aatctaaaag
15121 taaagggtac tctgaatcag ttaatcggac 
15191 aatggaattc cccctgattt cacaccraaa 
15261 ctactgatac gacaaaccca cgattcaagc 
15331 atctgaactg gttaactata caacaggcga
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gccaaaggcc aatttgttgg cacgggtgaa ttgattagac 
gagaacaaaa actagcaaca gcggccacaa tagctggcac 
cgaagcgggc ceagataaaa ccaacagcca tagcaaatca 
gcagccgacc egatgaaaga caacctcaaa ggcgctctgg 
caactgaagt tggtaaagat ttaacgccca tgatcagagc 
cggatttaca catcttcctg gttggttcag aaaggcctcg 
ggccctgctg cccctgctgg tggcccatta acacgtgcag 
taaatagacg cattgctgaa aatacaatac tgtctaatac 
tcaaacctta tttcttggtt ctacaacagg aaaaacgtca 
ttgtctaatt taaaacctat aaatgttttg aaaaattctg 
tgaaaaacgg tttaggatta gccgcaaaat ccttatttgc 
agccctaaag cttttaacag gacctatagg tgctacaata 
aaaaccgcat acgatcgtgt ggaatggttc agaaacggta 
ttggtggcaa aattattggc ggtgctgtea ggaagctagg 
caaaagcttc aaagaaaagt tttcaaagga tacgaaagat 
ctgaaagtag gagtcaacaa gtttaaagga tttatgcaaa 
atactgtaaa agcgccgggg aaaggtgttt caaaagaaac 
ttctgaagag aacaacagaa ccatggaaaa agcacgtcta 
aaaaaacttt egaaaattga agcggateta tctaacaacc 
aggaactcga aaaaaetcaa gaacttattg ataagtatag 
tttaactaga actaaagaaa aaaatgactt gcgaattaaa 
gaattgaaag aaaaagcttt aagtgatggt cagatttcag 
aaaaccaaag acgtgacatc actgttaaag aattgagtaa 
aagaatgcaa agaaacagaa atgcttattc aatagacgaa 
gcaagaaaag caagaaaaaa agaagtggac aagcaatatg 
tcaacccttc taagtctgaa aaagataaat tattagctat 
aaaggcaaaa cctaaaaaag atgctgtagt agacgttgtt 
atggatccat ccagtggtcg cgtatataaa aacactgaaa 
ctaaccccag agaagaccaa aagaagaaaa gcgataagta 
aaacagagaa aatataaaga aatggtttgg aaatgcttgg 
cttagtaaaa cgggcagaaa egctaatcae tttggcggcg 
gaattccaag caaatcaagt tcaggttgga gcccagccaa 
agctaatagt accggCaaat ggtttggaaa agcccggcaa 
aatcaaacta agcaaaagta ttcagatgce tcagataaag 
ggacatcaaa atggtttagc aatgcatata aaagtgcaaa 
gcgctcgaaa tgggataaca tttctagtac agcatggtcg 
aaatggttta gcaactcata caaatcttta aaaggttgga 
gttttgacgc aatttcaagt tcggcatggt ctaacgctaa 
atcaagaaca tatgaatgga ttagagatat cggtaaagac 
aatgttgcta ataaagctat tggcggttta aatagcatga 
ttactgataa aaatcccatc aagccaatac ctacattgte 
caccgataat tcgggagcac caacgcaacc gacatttgct 
ccaggtggtg gagctcaaga agtaattcac agggctgacg 
tggttgttcc aceaggagtt ggagatagtg taataaatgc 
tgttttgcca aaattccatg gtggtacgaa aaagaaagat 
aaaaaagcag gagaatttgg agctacagct aaaaacacag 
tggttgaagc agcaggcgat aaaa tcaaag atggtgcatc 
ggattacgta caacacccag ggaaactagt aaacaaagta 
ggaccaacgc tacagcaaaa attgctaaag gcgcgtactc 
aaaaccgtgg ttegaagatt ctggtggtgg aggcgatgga 
agacttggac gccacacagg tggacttaac tctaatgacg 
ctactggaac aaacgtttat gccgttaaag gcggtatagc 
caactccaca caaatcaaga ccggtgctaa cgaatggaac 
agacaaggcc aacgcatcaa agctggtcaa ctgataggga 
gagcacaeee aeatetccaa ttgatgcaag ggtcacatcc 
atggttgaag tcacttaaag gtagtggcgt tcgaagtggt 
gcaggcgata tacgtcgtgc agcaaaacga acgggtgcta 
ttagcttgac tcaacacgaa tcaggaggaa atgcaggtat 
cgttttacag ggcaatccag caaaaggatt gcttcaatat 
agaggccaca acaatatata tagcggttac gaccagttat 
cacagtttaa cccaagaggt ggttggcctc caagtggtcc 
aaagcatcaa ctcgccgaag tgggtgaagg agacaaacag 
cgagcaatte a a ctaactga acaggttatg cgcatcatcg 
taaataatga tacttctaca gttgaaaaat tgttgaaaca 
atcaacagac gcattgactc aaaetgtttc tccccaggat 
ggtttagaaa aaatattgte aaaacaaagt gggcacagag 
ccaattaatg caatcttttg taaaaatcae agatggttac 
cctatacttt tagatgcaag ggctgaaagt ccaaacacca 
atggtatttt accgggcgca attagttttg cgccttttt'c 
agaegtcata gacccaaacc tactcgagca ttggtttaga - 
gttattacct ctcaaatgcc cggtgttaaa tatgcagtga 
atggctcctc aactgaaatt gaagraagtt taaatgttta 
cgacagcgag t tec tat teg actctaattg gatgtttgaa 
tatactcaca caccaaacca atttactact tggaacggtt 
acgatttgaa aatattaacc aattcaaacg cgagtggagg 
tatttttaag tacaacaaaa gtatagacaa aaacactgat 
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15401 tttgttttag atggtgtgca tgcacatcga gatataaata gagcgggaac cgatacaaac agaggcatta 

15471 caacattagc gccaggtaaa aacgaattta agattaaagg agacatcagt gatactaaaa ccacactcaa 

15541 gtttcctttc acctataggt aggtgattta atggaceacc atgaccactc atcagtaatg gatttcaatg 

15611 aatcgacttg tgaaaattta ctagatgtag actatggtcc ttctaaagaa tatcacgaac cgaatgaagc 

15681 taggtacatc acttttacag tttacagaac tactcataac agttccgttt tcgatttacc aatttgtgaa 

15751 aacctcataa tttatcatgg tgaaaaacac acaactaagc agacagcgcc aaaggttgaa ggtgataaag 

15821 ttcttattga agttacggca tatcacacaa tgtatgaact ccaaaatcac tcagtggaat caaataagct

15891 tgatgacgac agtagcgaaa ctggcaaaac gccagaatac cccttagatg agcactcaag atacggattc 

15961 gcaaatcaaa aaacttcggt caaaatgacc tataaaataa ctggaaactc caagcgaaaa gcaccgattg 

16031 acgaattagg caacaaaaac ggcttagaac actgtaaaga agcggcagac ccatttggct gtataattta 

16101 cccaaaegat acggagatat gtttctattc tcctgaaaca tcctatcaaa gaagcgagaa agtgattcga 

16171 tatcaataca atactgatac tgtatctgca actgtcagta cattggaatt aagaacagct ataaaagttt 

16241 tcggaaaaaa gtatacagct gaggaaaaga aaaattataa tcctattaga acaaccgaca ttaaatattc 

16311 aaatggttct aeaaaagaag gtactcaccg taccgcaaca attgggtcta aagctactat taactttgat 

16381 tgcaagtacg gtaatgaaac agttagatct acaataaaaa agggctccca aggtggaata cacaagttga

16451 ttttagacgg caagcaaatt aagcaaattt cetgttttgc caagtcggtt cagtccgaaa caatagattt 

16521 aacaaaaaac attgataaag gcaagcacgc ttcagaaatg atatttttag gagaagaccc caaaaataga 

16591 actgatatac cttcaaataa aaaagctaag ccttgtatge atgttggaac egaaaaatca acagtcttaa 

16661 atttaattgc tgacaactca ggtcgcaacc aatacaaagc aattgttgac tacgtcgcag atagtgcaaa 

16731 gcagtttggg attcgatacg ctaatacgea aacaaatgaa gatatcgaaa cacaggacaa gctgttagaa 

16801 tttgcaaaaa agcaaataaa cgatactcct aagactgaat tagatgttaa ttatataggt tatgaaaaaa

16871 cagagccaag agatagcgca ttctttgttc atgaactaac gggatataac actgaatcaa aggtcgttaa 

16941 acctgatagg tcacatccat ttgtaaacgc aatagatgaa gtgtctttca gcaatgaaat aaaggacatg 

17011 gtacaaaccc aacaagcgct taacagacga gttattgcac aagataatag atataaccat caageaaatc 

17081 gtataaatca tttatacact agtactttga attctccttt cgagacaatg gatataggga gtgtattaat 

17151 acaatggcaa cagaagaagt taaaatcaaa gcgctacttg aaaacgataa acagtacttt ccagctacac 

17221 attggaaagc tataaatggg ataccttatg caggcagtag tgatattgat ggattgcctc aagacggtat 

17291 caccccggta gatgataaaa ataaattaga taatttaaaa ataggcgaag caggaactat tcaaaatagc 

17361 accgtacaga aateeccaaa cggtaaattg tggaaaataa cagtcgacga tagtgggaaa cttggtacag 

17431 tgccatttta ttagaaagga aggtgcatta tggaaaattt gtatttaaca aaggatttgg gagcttcagc 

17501 aggccgagac tatagagcta aggaaataca aaacttacaa agaatagagc aatttgcgct tggctcgaca 

17571 acagagttta agttgcatca gaaagctaaa acaattcaac acttcgctga gcaaatttat tataacggta 

17641 gatcgcaagc agcagtaaac aaatctttac aaagtcaaat taacgcactt gtcgtggcac cacgtaataa 

17711 cagtgcraac gagattgttc aagctcgagt taatgtaaac ggcgaaacct ttgacacatt aaaagaacat 

17781 ttagacgatc gggaaaccca aactcaaact aataaagagg aaactacaag agaattaaat aagaccaaac 

17851 aagaaattct tgatatcgag tatcgctttg aacctgataa gcaagaattt ttatttgtga cagaacttge 

17921 acctcttaca aatgcagtaa tgcaatcctc ctggtttgat aatagaacag gcatagtata catgacacaa 

17991 gctagaaata atggctatat gctaagtcgt ctaagaccta atggtcaatc tatagacagc tcattgattg 

18061 taggtggggg tcatggtaca cacaacggtc atagatatat tgatgatgag ttatggactt atagttttat 

18131 ctcaaatggt aataacgaga atacattagc tcgtttcaag tatacgccta atgtggaaat cagctatggc 

18201 aagtatggta tgcaagatgt atttacagga cacccagaaa aaccctacat caeecctgtc acaaatgaaa 

18271 aagaaaataa aattctatac agaattgaga gacctagaag tcactgggaa cttgaaaact caatgaatta 

18341 tatagagata agaagtttag acgatgttga caaaaatatt gataaagttt tgcataaaat cagtatccct 

18411 acgagactaa caaacgaaac ccaaccaatg cagggtgtga cttttgatga aaaatacttg tattggtata 

18481 caggagacag taatecaaat aatagaaact atttaacggc cttcgattta gaaacaggag aagaagcgta

18551 tcaggttaat getgactatg gtggaacact agattcattt cctggcgaat ttgcggaagc agaaggtttg 

18621 caaatatact acgacaaaga cagcggcaaa aaagctttga cgctaggtgr tactgtcggt ggtgatggaa 

18691 acagaacaca tcgtatttec atgattgggc aaagaggtat tttagaaata cttcactcaa gaggcgttcc 

18761 ctttatcatg agtgacacag gtggtagagt taaaccttta ccaatgaggc ctgataaact taagaatctt 

18831 gggatgtcaa cagagceagg tctttactat tcatacactg atcatacagt ccaaatcgac gatttcccat 

18901 taccaagaga atggegtgat gcaggttggt tcttggaagt taagccacca caaactggcg gtgacgtaat 

18971 ccagatattg acgcgtaata gttatgcaag gaatatgatg actttcgaaa gggtgctttc tggaagaaet 

19041 ggagacactt cggactggaa ttatgtgcct aaaaatagtg gtaaatggga gagagtacct tcattcatca 

19111 caaaaatgtc agatattaac atagtaggca cgtcgtttca tttaactacg gatgatacaa aacgttttac 

19181 agattttcca actgaacgca aaggggtagc tggttggaac ttatacgtag aagcctcaaa cacaggcggc 

19251 tccgttcaca ggctagttcg taatagtgtt acagcatctg ccgagacact attgaaaaat tatgatagta

19321 aaacaagttc agggccatgg actttacacg aagggagaat tataagttaa tgagtaactt agagaaatet 

19391 gtagctataa atttagaaaa cacagcgcat tatgaaaaca cttcaaatct agacaeaact ttcagaacag 

19461 gagagagtga ttcttctgtt cttcttttta acaccaccaa aaataatcaa ccgttattat tgagtgaaga 

19531 aaacatcaaa gcacgaatag cgattcgagg taaaggagtc atggtagtcg ctccaetaga aatattagat

19601 ccatttaaag gtattttaaa acttcaatta cccaatgatg taattaaacg agatggaagt tatcaagctc 

19671 aagtttcggt tgcagaatea ggtaattcag acgtggtagt tgtcgagaga actatcacat ttaacgccga 

19741 aaaaagtttg tttagcatga ccccatctga aacaaaacca cactatattg ttgaatttca ggaatcagaa 

19811 aaaactatta tggatcgtgc gaaagcaacg gacgaggcta taaaaaatgg tgaagattat gcgagtctga 

19881 ctgaaaaagc taaagaaaaa ggtctatcag atattcaaat agcaaaatcc tcaagtatag atgaattaaa 

19951 gcaacttgct aacagccata catctgattt ggaaaataaa gcgcaagcac attcaagaac acccgatgag 

20021 caaaagcgat atatggacga gaaacatgaa gccttcaagc agccagtgaa tagtggtggt ttagccacaa 

20091 gtggctctac ttcaaattgg caaaaagcca agattactaa agatgatggc aagataacge agattactgg 

20161 atttgatttt aataatccag aacaaagaac aggtgattca acccaattta tttatgtttc gcaagctata 

20231 aatcacccaa gaggtgttag tactaacggt actgtcgaat atttagcagt aacttcagat tacaagcgta

20301 tgacttatcg accgaacggt acaaataaag tgtttgttaa aagaaaagaa gcgggctcac ggtctgagtg 

20371 gccagaatta gctattaatg actacaatac accttttgaa accgttcaaa gtgcccaatc aaaagceaat 

20441 acggccgaaa gtaacgctaa attatacgca gatgacaagt ttaataaaag gtatccggtt atttttgatg 

20511 gaacagcaaa tggtgtgggc tctacattgt acttaaatga gagtttagac caatttattt tattaacttt 

20581 ttatgggact tttccaggcg gtgactttac agagtttggc agtccttttg gaggaggaaa gacttcattg 

20651 aaccccccaa atcttccaga tggtgatgga aatggtggag gtgtttatga gtttggatta actaaatcta 
20721 gtcgtacatc cttaactata tcaaacgatg

20791 cgcaaataga gggacaatta acaaaatcac 

20861 atgagacaat ttcatacgct atcattggtg 

20931 tttctctcaa gctctcagac ctaaagcctt 

21001 tcagaagaaa aagatgactt gcatcaacag 

21071 tcttacgaaa aatggttgct agcacgcaga 

21141 taagcaaaat gcactaacgg caaaacaact 

21211 actgaaaacg ectaaactaa tctcaccaac 

21281 agtaaagaag acatagcgtg gcatgtagat

21351 gagaaaagta cccagaaaac ccagagtcac 

21421 tggtgtaatg tttggattta ccaaacgaca 

21491 aagactatgt ttgaaaaatt cgacagaata

21561 tagatagaaa tttcgaagaa ctaaggcgtg 

21631 aaatattaga gacaccaaga tgtggactct 

21701 ctgctaaaaa ccatttttgg cactcaaagg 

21771 ttggtcgtgt tcctggttta gcaagtgtaa 

21841 tttggaaaaa aggagcaaac aaatggatgc 

21911 gtaaatcaac ccttagcgaa caaaggtatt 

21981 tacttaccgt tgttgcttta tatactacgt 

22051 tcaaaagcca aagaaatata aagctgaaaa 

22121 gtaatgacac ctacgaatat gaacgacaca 

22191 aaccaagcag aaaaatggct tgataateca 

22261 agtgteacga ttacgcaaat atgtttttta 

22331 taatattcca tttgataata aagcaaggat

22401 ctaccgcaaa agttggacat tgtcgttttc 

22471 Ctgagagcgc taatctaaac actttcacat 

22541 cgttgcgcaa cctggteggg gtcccgaaac 

22611 tttattagat taaatttccc agacaaagca 

22681 ctgccaaaaa gcaagcagta actaaaccta 

22751 eggagcagea ggaaacggaa caaacgaacg

22821 tatttaagac atgccggtca tgaagtcgca 

22891 atacagcaca cggtgttaat gtaggtaata 

22961 tgacattgtt ctagaaacac atttagacgc 

23031 agtcaattca atgcagatac tattgataaa 

23101 gaggtgtaac acctcgtaac gatttactaa 

23171 atctgaatta ggtttcatca ctaataaaaa 

23241 aaactaatag ccggtgcgat tcatggtaag 

23311 ttaaaaacga aaagaatccg ccagtgccag 

23381 agaaactggt tattacacag ttgccaatgt 

23451 agaattactg gtgtattacc caataacgca

23521 gatggattac ctatattgct aatagtggac 

23591 taacagaata agcagtttcg gtaagtttag 

23661 cttcggtact tgcctatcat ttaaaattaa 

23731 aaacaaacgt ttttagtaea taaateattt 

23801 tcaactatat cgtggtttta tgrttattat 

23871 aegggttttt ttcgaaacaa tagtaaaaaa 

23941 aaatatttaa tcctatcaaa agttaaaaag 

24011 atgattttta tggtcaaaaa aagactatta 

24081 ctccgtttca cgaatccaaa gctgataaca 

24151 aacagaagae acaagcagcg ataagtgggg 

24221 aagtataaca aagacgcttt gatettaaaa 

24291 acaaaaacac agatcataca aaagcaacga 

24361 ccccaatgta gattcaataa attatctacc 

24431 ggttataaca taggcggtaa ttctaatagt 

24501 aaacaattag ttataataaa ataaaaagca 

24571 caggagtttt aggttacata ccatacaaac 

24641 tatcaaeacc cctgcactac cgactccccc 

24711 tttaataacg ttgatttaaa aaatttgaat 

24781 ctctatttat ctttgtctta acagtgtttg 

24851 taatataacc agaaagttta tgaaactgga 

24921 aacaacggta aaccagtatt taeagttata 

24991 aaaoctataa ttcagctggt agcgattccg 

25061 tttaccgtca aacgacgaat tgtatattaa 

25131 ttatatttaa egaatgaata ctaatctttt 

25201 cggcgcccgg ctttccaaaa cttetgttca

25271 cgccataaaa ttctcaccac cattcaacgt 

25341 gaatcttctt tggttaactt atceccatct 

25411 gtgacgataa acctttaggt aactcataag 

25481 agtttccttt tttattttgc aattagtcat 

25551 gtcttttata ttaaagcgcc acacaggcgc 

25621 ccaaacgaag cgactttgac atcatcacac 

25691 tatctacacg cttgataaga cttactccat 

25761 tCcettctta ataaaagcgt atgtcccctg 

25831 caaaaaccag cactcgatgg cgtttcgtct 

25901 attcctctcc catgccagca ccagttgcac 

25971 acctatagaa gtgactttac tctgttcttc
tctatttcga cttaggaagt caaagaggcc ctggtgcgaa
aggagtgaga aaataatgca aaeattagtt aacaagcgta 
gctctgaaga aggtattgae atcgaaaact taccagaaaa
taaatattca aacggggaaa tagtttttaa cgaagatcat 
actgacagcg aagaacaaaa cacagtcgce cctgacgaca 
aacaagtcge tcaaagtaca aagctatcga tgcaagttaa 
cgtgacacct aacaaaaaac tagaagaggt taaaggagag 
actcgaagat attaaaacat ggtatcaatt gaaagaatat 
acggaagtta tagataaaga ggaatatgca attattacag 
aggctataat cttatggctt ttcaatttga acaaagcggg 
cgaacaagat tggcgtttaa cgcgattaga agaaaatgat 
gaagacagtc cgagaacgca agaaaaaatt tatgacaagt 
acaaagaaga agatgaaaaa aacaaagaga aaaatgctaa 
aggaetaata gggacgatcc taagtacatt tgteatagcc 
aggtgactac catgcttaag ggaattttag gatatagctt 
gtaatagtta agagtcagtg cttcggcact ggctttttat 
aaaagtaata acaagataca tcgtattgac cteagcatta 
agcccgattc cagtagacga tgagaataca tcatcaataa 
ataaagacaa tccaacacct caagaaggca aatgggcaaa 
caagtataga aaagcaacag ggcaagcgcc aattaaagaa 
aacgatttag ggtaggtgtt gaccaatgtt gataacaaaa 
ttagggaagc agttcaatcc tgacttgttt tatggatttc 
tgatagcaac aggcgaaagg ttacaaggtt tatacgctta 
cgaaaaatac gggcaaataa ttaaaaacta tgatagctct 
ccgtcaaagt acggtggcgg agctggaeat gttgaaattg 
cgtctggcca aaactggaat ggtaaaggte ggacaaatgg 
cgttacaaga catgttcatt attacgatga cccaacgtat 
agtgttggag acaaagccaa aagcgxtact aagcaagcaa 
aaaaaattat gcttgtagcc ggecatggtt ataacgatcc 
cgattttata cgtaaatata taacgccaaa taccgctaag 
Ctatatggtg gctcaagtca atcacaagac atgcatcaag 
aaaaagatta tggcttatat tgggttaaat cacaggggta 
agcaggagaa agcgcaagtg gtgggcatgt cattatctca 
agtacacaag atgttattaa aaataactta ggacaaataa 
atgttaacgt atcagcagaa ataaatataa attaccgctt 
tgatatggat tggattaaga aaaactatga cttgtattct 
cctatcggtg gtgtgacatc cagtgaggtt aaaacaccag 
caggttatac acccgataaa aataatgtac cgtataaaaa 
taaaggtaat aacgtaaggg acggctattc aactaattca 
acaatcaaat atgacggcgc atattgtatc aatggctata 
aacgtcgtta Cattgccaca ggagaggtag acaaggcagg 
tgcagtttga taattgtata tgacgaatct taggcaggta 
taaacagtta atttttacat gaatatatta aattttaaaa 
tgtgttcgta ttgtgtgcta tgattaaaaa gttgttatgg 
caatcaaaat ataaattatt tataatttgt ttggtaatga 
acacatttgt agatatttta aacccggtaa atcttttaat 
gtttaatata aaaatgtaac aaaatttata aagaaaggaa 
gccgcaacac tgtcgttagg aataatcact cctattgcta 
acaccgagaa tactggtgac ggcgccgagg tagtcaaaag 
ggtcacacaa aacattcagt ttgattctgt taaagataaa 
atgcaaggtc ttatcaactc aaagactact tattacaatt 
ggtggccttt ccaatacaat attggtctca aaacaaatga 
caaaaataaa atagattcag caaatgttag ccaaacatta 
ggtccatcaa caggaggtaa tggttcatct aattattcaa 
ggtgataaga tgactcaatt tctaggggcg cttcttctta 
atctaacaac gataggttta gttagtgaaa aaaacaaggt 
tatcgaaaca cgtttgatac ggtcttatag ctccacaatt 
ttaattcagt tgcccacagg tctaaaagca aatattttgt 
tatttaatcc tttaatcgtc aaatttatta tctggttaat 
tcgtataagc tcattagaca aaagagacaa gttgtttaat 
aaagactttg aaaacagaat cattgaagag ggtgaactta 
atttactaga agttgagcga caagatttca aagtatctga 
acatacactt gtagacccta aacaacaaac caaactggat 
ttcttagctc ttcctgataa agtgcttttt aattcttcgc 
ttgggttact acgagtagct tcttgttttt tgtctttatc 
ctacacttgt aggcgctttt ttatttagta aagtcataat 
atcttttgtg aaataaaccc caagtattta cgcgcattat 
tgaatggttg attaccacca gttaaaacct catacaccac 
ttccattata aacctccttt caaacactgc tgaaatagac 
tgttaatcac aacacaactt tgcccattac cttaatacta
ttcggaccta gagataccaa attaatatag tcttcgcata 
ccaacacaac gagtgcaatc gtaccacctc taatagaatc 
ttttaacata ggttccattg aatcaccatt aaccaaaaca 
tctttaaaaa atacttcttc acgcaatatg ccaccacata 
cacatgcaac atacgatact agtttagact ctctaCatcc 
caattgctca cctgcatagt taagtacgtt ttcttggcgg 
26041 ggaggtgtga gtttgctgta tatggaagtg

26111 acaaaccatt aaccteeaca ttgaagtact 

26181 ctcactttcc cattttgata tcttgccttt 

26251 gttgctaatt gctccatagc catattttta 

26321 cgaaaccctc cttatataag ataatttcat 

26391 gcaaaagttg ttgacatcga aacttttatg 

26461 agggggttca atgacaacta gtgtagcaga 

26531 ggaactaacc aaaaagaagc tgctaaagca

26601 gaattaatgg cagagatttt acaacttcag 

26671 tgattttttt taaacctcaa gtttcgaaag 

26741 taacgttaac caaagaagag ttgaaagaaa 

26811 accaatcagc tcaggtgcaa ttttcagtaa 

26881 aaactcaact tcgcaaaaga tttgtcgcta 

26951 agtatcagca tggcttcgaa tcaattcatc 

27021 attaacatta tcaatttttg gagtgacact 

27091 aaaatttata gagatatcaa aaactattat 

27161 atgatttcga acgaaggagg aactacaaat 

27231 ttaattgacg tgtggcatgg aaatcaacgg 

27301 tctcggatag agaaggtaag aaatatctaa 

27371 ctactgcctt acaatcctaa atcctttctg 

27441 aaatgctgaa atagtcacga gcaacgctat 

27511 ctcgttatga atcttatgtc tatctagagc 

27581 tctaaatcca taaatttcac ctccttccac 

27651 ggaacgacaa atgcaagctc aaaacaaaaa

27721 ccattagata ttcaaattaa tgacggatat 

27791 aagaaatacc atacgtaaac aataacttat 

27861 Cttegagaaa gatattgaaa agctaatccc 

27931 ctcttttaac ttcgttccaa gtttcattgt 

28001 gtcatcaata atccaagaaa cgaccctgcc 

28071 tctaatttta aaagrgagca cattactgtt 

28141 tatgtcgagt aagtggttca cctattttct 

28211 gtgattaagt ttcatcctat cacctccata 

28281 aggaataaca aatgaacatt caagaagcaa 

28351 agattggaaa gaaagtcatc gaactaagat

28421 aatagcgatg ggacaaacct tatcagatac 

28491 aagttataaa cccaactaga gaccaggaat 

28561 aaattgtttt taaactcatt ctcaaagtaa 

28631 tactagcata cacgccgttt aggaacccag 

28701 atgtagtttt tgaaaatact ctgtatgtat 

28771 gaacctaacc ttacacatte taaataatct 

28841 gtgtatcaaa ttcateagat atcaagggca 

28911 aaggagcata aacaaatgaa cacaagatca 

28981 ctgatgctte ttcatcctat ttaacggaaa 

29051 aaaaagtgat tacagctact tagaaataaa 

29121 gcgaataaca acaaacttta acatttatct 

29191 caccagaaaa cacatatcga ggcgaagaaa 

29261 tcaattgttt ggagtatgta gaagtacagt 

29331 gtagaaaatt tatacattga ttattcagca 

29401 tgatcagaaa gcataaaaaa tggtactagg 

29471 agcagtgttg tgcttcacgg tctcagcgat 

29541 attgcgggat tcgcaagtat cgcaacattc 

29611 gcracttgtt ggagcaagta acagtgcaag 

29681 tatgacctta caacaaaaaa taccatcaca 

29751 gaagtttttg ggatatctaa aacacatgca 

29821 aattggaaag ttggggtatc tggcgtgttg 

29891 agagatatta gaagaacaat tcgagttact 

29961 gaagaacgca tcaagttaat gattcgttta 

30031 agaaggtatt tttgaagaat taaaactatt 

30101 gtagattcat caattgtaca agagaaagtt 

30171 aatcagttga agaagttaag gaaacttctg 

30241 gttccttaaa aaagcagata cttctgataa 

30311 aagctatcta ctatcaaaga agagcattat 

30381 gaagctagat cactcaaata gagctcatgc 

30451 ccaccgagta ttaaggcaag tgaaggtact 

30521 ctcatgagtt aagtgagtta tatttcagtc 

30591 ttttcaaaat tataagcgaa atcaatatta 

30661 aatgtagaag aaaaatacaa cgaagctttg 

30731 tggattcagg taaatacgtc cctgaatctt 

30801 tgaaattatt gaccttaaat acggtaaagg 

30871 taeggcttgg gcgcatacga actgcttagt 

30941 aaccacgaat agataacttt tctactgaag 

31011 tgttaaacca ttagccagac ttgcttataa 

31081 tgtaagataa agcattcatg tagaacacgt 

31151 tgttgagtga tgaagagatt gcagaacttt 

31221 agaaaaatat gcactagatc aagcgaaaga 

31291 cgctcgegaa gaatgataac tgatacaaat 




acgtcgttat cgtctttgta tgcagtattt gattcaccac 
cagccaaaat tttggcagtt gacaaccgag gttecteett 
cgtcaattcc attaagtcgg gatatttatt atcaagatca 
ctttctccct agcttcttta aaccttcacc aatacccata 
tataaaagct ccgaaaacga aacgcaagga aaatattact 
acgcattctt aaatcaagtt gtcacaaacg aaacaaaagg 
taaaccatac ctaaaaataa aaagcttgat cgcactcaaa 
atcggaatga gtagaagttt attgagtata aagataaatc 
aagccaaaaa attagcagat catttaaatg ttaaagttga 
tgacaactaa ataaaaataa ggaggacact atggaacaaa 
ttatagcgaa agaagttaga aacgctataa aaggcgagaa 
agtaagaacc aacaatgacg atttagaaga aatcaataaa 
ggaagattga ggaagctcaa tcatccgatt ccgctaaaaa 
aaaaagctta tgtacaagat gttcatgacc atattagaaa 
taattcagac ttgagtgaaa gtgaatacaa cctagcagca 
ttatatatct atgaaaagag agtttcagaa ttaactatcg 
gaaactacta agaaggctat tcaataaaaa acacgaaaac 
ttaaaagtga aagaaagcaa attaaaaaaa tataaagtgg 
ctaaataagc gcacttaatt agtgcaagta atcaagtgcg 
ctttctccct cCtcttgCaa tcccaataac acagaagagt 
ctttagcgaa tgcaattacg tcatcaccga cttcttgcca 
cctaggtaac agcgagactg Caatatcgtg agcaatCttc 
tgggagataa ctaaattata taacaaaaca acttaaagga 
agtcatctat tactactatg acgaagaagg taataggcga 
gaactgatgg tccgatctca tttcatcaac aacaccattg 
atgccttggt tgatggttat gaatttaagt tagattgaat 
cccataagac taagagacac actggatgct ttgttaacga 
ctctaacact atcgagaaac tcatggccag accaagtgat 
ctcgatgaat ttcagatcgc aacaaataaa ctCagcttct 
tcaaaatcat atttatcaaa aataatatta tcgttgaaat 
tattagatcc tatttctaag agcaagagtc taacgcaatc 
acaggagtac agcagaaagg atcataaaca tcttaaaagg 
ctaagatagc tacaaaaaat cttgtctcta tgacacggaa 
attaccaaca aatgatagtt ttttacaatg catcatttca 
tggcaacctc cagccgatga cctcatggca aatgattggg 
tattgaagca actttagaaa tgctatcaat gatacttttt 
acaacagtct tgtctgaaat tgttacatga taaatagtgt 
agtttttaag tttatttaaa tcgtatttta catcttcgaa 
atctttagca cttccaaaat tattgcaggt taatttaacc 
ttgtagagta cggacaagat atattgttgg tctttagtaa 
tgttatcacc tccttaggtt gataacaaca ttatacacga 
gaaggattgc gtataggcgt cccacaagtt tctagcaaag 
aggaacgtaa cttaggagcg gaaatattag agcttattaa 
caaagttttc tatgcattag atagagaact tcaatacagg 
aaaggagtga tagagacgcc aaaaaccaca ataccaccaa 
aatttgtgaa aaagttatac gcaacaccta cacaaatcca 
acacaactgg ttgaaatatt accgtgaaga taatttaggt 
acgggaacat tgattaatat ttctaaatta gaagagtatt 
aggattatca aatgagcgac acatataaaa gctacctatt 
tgtactcatg ccgtttctat acttcactac agcatggtca 
atattttata aggaatactt ttatgaagaa taaagaaact 
atgagcaatt gtcttaaata actatataag gagttattaa 
tcttgcaaca tatgacaatt tcaattctga tgatgttgtt 
aaatccacac ttccaagact taagaaaaaa ggaaagattg 
ttgaaccgca gttacattta actgttgtag aacgCaagaa 
ggcaagatta aacgaacaaa gtgatgaccc cagagaaata 
gccaaccaat tttaaggagg agttaatcaa tggcaatatt 
aaacaagaat ttacgtgtgc taaatactga actatcaact 
aaagaagcac caatgccaaa agatgaaaca gctcaactgg 
ctgatttaac taaagattat gttttatcag taggaaaaga 
gaaagaattt agaaataaac ttaacgaact tggtgcggat 
gaaaaaattg tcgattttat gaatgcgaga ataaatgcat 
aaagcecagc gcaagtggag caaaacaacg gceaaactgt 
gcagataaaa gttcagtttt tgctgaagaa ggtacattcg 
ttaaatatga aggcctaaca cagtttgagt ttaataaagc 
cagtgaagag ttgcgcgaat acgttgaaga gtacgtagct 
agtagagatg acgatgtaat agctttattt gaaacaaaat 
ttggtactgg tgatgtcatt atattttcag gtggtgtact 
cattgaagtt ccagctatag ataatcctca acttagatta 
ttaatgtatg acattcatac agttcgcatg actatcatac
agttaccaat atcaagatta cttcaatggg gaaccgattt 
cggtgaaggt gagtttaaag caggtagtca ttgtagattc 
gcagaataca cgcaaaatgt gccccaaaag ccaccacatt 
tatataaact gcctgacatc aaaaaatggg ctgatgaagt 
aaatgataaa aactattctg gttggaagct tgtagaaggt 
gcaacgcttg aaaagttagt tgaagcaggt tataaacctg 
31361 aagatattac agaaaccaag ttacttagca tcacgaactt agaaaaatta atcggcaaaa aagcatttcc 

31431 taaaattgca gaaggcttca tagaaaagcc acaaggtaaa ccaacacttg ctaccgagtc egaeaaacga 

31501 ccagctataa agcaacctgc cgaagatgat tttgacaaac tataaaaatt aaaaaggacg gtatataaac 

31571 acgaaagcaa aagtattaaa caaaactaaa gtgattacag gaaaagcaag agcatcatat gcacatattt 

31641 ctgaacctca cagtatgcaa gaagggcaag aagcaaagCa tccaaccagt ttaatcattc ctaaaccaga 

31711 tacaagtacg ataaaagcca ttgaacaagc tatagaagct gctaaagaag aaggaaaagt tagtaagttt 

31781 ggaggcaaag ttcctgcaaa tctgaaactt ecattacgcg atggagatac tgaaagagaa gatgatgtga 

31851 attaecaaga cgcttatttt attaacgcac caagcaaaca agcacctggt attaetgacc aaaacaaaat 

31921 tagattaacg gattctggaa ctategtaag tggtgactac attagagctt caatcaattt atttccattc 

31991 aacacaaatg gtaataaggg catcgcagtt ggattgaaca acactcaacc tgtagaaaaa ggcgaacctc 

32061 ttggcggtgc aagtgcagca gaagatgatc tcgatgaact agacactgat gatgaggatt tcttacaagt 

32131 caataggtgg ggtttttagc cccactttaa tcctaaagaa actgaggtgt caagaatttg aaacttatga 

32201 atatagacat egaaacatat agcagtaacg atatttcgaa atgtggtgtc tataaataca cagaagctga 

32271 agatttcgaa atcttaatta tagcttatcc aacagacggc ggaccgatta gtgcgattga caCgactaaa 

32341 gtagataatg agcctttcca cgctgattat gagacgttca aaattgctct atctgaccct gctgcaaaaa 

32411 agtatgcatt caatgctaat ttcgaaagaa cttgtcttgc taaacatttt aataaacaga tgccacctga 

32481 agaatggatt tgcacaatgg ttaattcaac gcgcattggc ttacctgctt cgcttgataa agtcggagaa 

32551 gttttaagac eacaaaacca aaaagataaa gcaggtaaaa attcaattcg ctatttctct ataccttgta 

32621 agccaacaaa agttaatgga ggaagaacaa gaaatttgcc tgaacatgat cttgaaaaat ggcaacaatc 

32691 cacagattac cgtacccgag atgtagaagt agaaatgaca attgctaata aaactaaaga cttcccagta 

32761 actgtaaccg aacaagcata ttgggttttt gaccaacaca taaacgacag aggtattaag ctttctaaat 

32831 cattgatgtt aggagctaat gtgctcgata agcagagtaa agaagaattg cttaaacaag ctaaacatat 

32901 aacaggttca gaaaatccta atagtcccac acagttattg gcttggttaa aggacgaaca aggactagac 

32971 atacctaatt tacaaaagaa aacggttcag gattacttaa aagcagcaac aggaaaagct aaaaaaacgc 

33041 cagaaactag attgcaaaCg tctaaaacca gtgtgaaaaa acacaacaaa atgcatgaca tgacgtgcag 

33111 tgatgaacgg gtaagaggtc tgtttcaatt ctacggtgcc ggtactggaa gatgggcagg tagaggtgta 

33181 caacttcaga atttaacaaa gcattatatt tcagacaccg aattagaaat agcaagagat cttattaaag 

33251 aacaacgctt tgacgattta gacctattac tcaatgtcca tcctcaagac ttattaagtc aattagttag 

33321 gacgacattc actgctgaag aaggtaatga actagcagta agtgattttt ctgcaataga ggcaagagtc 

33391 atagcatggr atgcaaaaga acaatggcgt ttagatgtgt tcaacacaca cggaaagata tatgaagcat 

33461 cggcttctca aatgtttaat gtaccggtag aaagcataac taaaggcgac cctctcagac aaaaaggaaa 

33531 agtgtccgaa ttagcctcag gctatcaagg tggcgctgga gctttaaaag caatgggtgc attggaaatg 

33601 ggcattgaag aaaacgagtt acaaggttta gttgatagtt ggcgtaacgc aaatcctaac atagtcaatt 

33671 tttggaaggc ccgccaagag gctgcaacta ataccgtaaa atcccgaaag acgcatcata cacacggact 

33741 tagattttat acgaaaaaag gttttctaat gattgaactg cctagtggaa gagctttagc ttatccaaaa 

33811 gctttagttg gtgaaaatag tcggggtagc caagttgttg aacttatggg gttagatctt aaccgtaaat 

33881 ggtcaaagtt aaaaacgtat ggtgggaagt tagtcgagaa tattgttcaa gcaactgcaa gggatttact 

33951 tgcgattcct atagcaaggc ttgaagcact aggttttaaa atagttggcc atgtccatga tgaagtaatt 

34021 gtagaaatac ctagaggttc aaatggactt aaggaaatcg aaactatcat gaataagcct gttgactggg 

34091 caaaaggatt gaatttgaat agtgacgggt ttacttctcc gttttatatg aaggattagg agtgtgatcg 

34161 catgcaacac caagcttata tcaatgcttc tgttgacatt agaattccta cagaagtcga aagtgttaat 

34231 tacaatcaga ttgataaaga aaaagaaaat ttggcggacc atttatctaa taaeccaggt gaactattaa 

34301 aatataacgt tataaatatt aaggttctag attcagaggt ggaatgatgg ctagaagaaa agreataaga 

34371 gtgcgtatca aaggaaaact aacgacactg agagaagttt cagaaaaata tcacatatct ccagaacttc 

34441 ttagatatag atacaaacat aaaatgcgcg gcgatgaatt atcgtgtgga agaaaagaet caaaatctaa 

34511 agatgaagtt gaatatatgc agagtcaaat aaaagatgaa gaaaaagaga gagaaaaaat cagaaaaaaa 

34581 gcgattttga acctatacca acgaaatgtg agagcggaat atgaagaaga aagaaagaga agatcgagac 

34651 catggcccta tgatggaacg ccacaaaaac attcacgtga tccgtactgg ttcgatgtca cttataacca 

34721 aatgttcaag aaatggagtg aagcataatg agcgtaatca gtaacagaaa agtagatatg aacgaagcgc 

34791 aagacaatgt caagcaacca gcgcactaca catacggcga catcgaaatt acagacttca ccgaacaggt 

34861 tacggcacag catccacctc aactagcatt cgcaataggt aatgcaataa aatacttgcc tagagcacct

34931 ttaaagaatg gtcatgagga tctagcaaag gcgaagtttt acgtccaaag agcctttgac ttgtgggagt 

35001 gatgaccatg acagatagcg catgtaaaga atacttaaac caatttttcg gatctaagag atatctgtat 

35071 caggataacg aacgagtggc acatatccat gtagtgaacg gcacttatca ctttcacggg catatcgtac 

35141 caggccggca aggcgtgaaa aagacacttg atacagcgga agagctcgaa acatatacaa agcaacatgg 

35211 tttggaatac gaggaacaga agcaactaac tttattetaa ggagatagaa aegatgaaaa tcaaagttga 

35281 aaaaataatg aaaatagacg aattaattaa gtgggcgcga gaaaatccgg agctatcatt tggcagaaaa 

35351 tattatacaa cagacaaaaa cgatgaaaac tttatttact tcggtgttct taaaaattgt ttcaaaataa 

35421 gcgattttat attagttaat gccactttta gcgtcaaagt tgaagaagaa gtaaccgaag aaaccaagtt 

35491 tgataggttg tttgaagtgt acgagattca agaaggagtc cataaatccg caccatatga gaacgctagt 

35561 acaaacgaac gcctaaaaaa cgacagaact tttcttgcta aagcattcta catcttaaac gacgacceaa 

35631 ctatgacgtt aatttggaaa gaaggagagt tgactaaata atggaacacg gttcaaaaga atattacgaa 

35701 aagcaaagtg aacactggtt tgatgaagca agcaagtttt tgaagcaacg tgatgagctt accggagata 

35771 cagctaagtt aagagagtgc aacaaagagc cggagaagaa agcaagtgca cgggataggt ategcaagag 

35841 cgttgaaaaa gatttaaeaa acgaattcgg caaagatggc gaaagagtca aatttggaat ggaattaaac 

35911 aataaaattt ttatggagga agacgcaaat gaataaccgc gaacaaatcg aacaatcagc tactagtgct 

35981 agcgcgtaca acggcaatga cacagaggga ctattaaaag agattgagga cgtgtataag aaagcgcaag 

36051 cgtttgatga aatacttgag ggtttaccta atgccatgca agatgcaatc aaagaagata ttggtcttga 

36121 tgaagcagta ggaattatga cgggtcaagc tgtctataaa tacgaggagg agcaggaaaa tgactaacat

36191 attacaagtg aaactattat caaaagacgc tagaatgcca gaacgaaatc ataagacgga tgcaggttat 

36261 gacatacttt cagccaaaac tgtcgtactt gagccacaag aaaaggcagt gaccaaaaca gatgtagctg 

36331 taagcatccc agagggctat gtcggtttat taactagccg tagtggtgta agtagtaaaa cgcatttagt 

36401 gattgaaaca ggcaagacag acgcgggata tcatggtaat ttagggatta atatcaagaa tgacaatgaa 

36471 acgttagaga gtgaggatac gagtaacttt ggtcggagtc cttctggtat agatggaaaa tacaccctac 

36541 cacctgtaac agataaattc ttatgtacga atggtagtta cgtcataaat aaaggcgaca aactagctca 

36611 actggtcacc gtgcctatat ggacacctga actaaagcaa gtggaggaac tcgagagtgt ttcagaacgt 
36681 ggagcaaaag gcttcggaag cagcggagcg caaagacaca ttagaccgag ecaaggaggc ttcggggaag

36751 tgagtgacat gttagaaaca tttctcatag ggtttggtgt ttatctattt tgtcgcatag gtattatttt 

36821 ccccaagage aaaaagacca cacacacaaa cctatacgaa atgetgttga ttgctactat ccttgcgaca 

36891 cccacatctg ctgataaaca ccaaaagacg cataccctaa tagcattttt agtaatgttt tttatgagta 

36961 agctcaaaca agttcaaggg agctatgagg aacgacacaa cacctagtca caacatttaa agacccaaca 

37031 ggacgtaagc atacacacat aactaaagct aagagcaatc aaaggtttac agttgttgat gcggagagta 

37101 aagaagaagc gaaagagaag tacgaggcac aagccaaaag aaacgcagcc atcaaattag ggcagctgtt 

37171 cgaaaataca agggagcgtg ggaaatgact aaacaaatac caagattatc atccctacta gcgatgtatg 

37241 agccaggcaa gtatgtaact gagcaagtat ataetatgae gacggctaat gatgatgcag aggcgccgag 

37311 cgacettgaa aaaatcagag ctgaagtttc atggtaatag ctattatcat ttttgaatta attatattaa 

37381 tgcgcttagc aatagcactg gaggtgtcgt aaatatgtgg attgtcattt caattgtttt atctatattt 

37451 ttattgatct tgttaagtag cacttcccat aagacgaaaa ccacagaagc atcggagtac acgaacgcte 

37521 atcttttcaa gcagttagta aaaaataatg gtgtcgaagg tatagaagat tatgaaaatg aagccgaacg 

37591 aattagaaaa agatttaaaa gccaaagaga ggcgtcggcc tctctgetct atttaaaata atgaaaggag 

37661 ccgaacatgt tagacaaagt cactcaaaca gaaacaacca aatatgatcg tgatgtttca cattcttatg 

37731 ctgctagtcg tttatctaca cattggacta accacaatat ggcttggtct gactttacgc agaagctagc 

37801 acaaacagtC agaaccaaag aagactcaac tgagcacaac aaaatgtcta agtctgaaca agccgatata 

37871 aaagatgttg gcggatttgt cggtggttat ttaaaagaag gcaaacgacg tgctggtcaa gtcatgaatc

37941 gctcaatgtc aacactcgat atcgattatg ctgcccaaga tatgactgac acattaccea tgtcttatga 

38011 ttttgcatat tgtttatatt caacacataa gcatagagag ataagcccaa gactgcgttt agtgattcct 

38081 tcaaaacgaa atgtaaatgc agatgagtat gaagctattg ggcgtaaagt cgcagatatc gttggcatgg

38151 attacttcga egaeacaaee tatcaaccac ataggctaat gtattggcct tcaactagta acgaegcgga 

38221 acttttcttt acccatgaag atctaccttt gttagaccca gataaaatat taaatgaata tgttgatcgg 

38291 actgacacat tagaatggcc aacgtcttca agggaagaga gtaagactaa aagactagca gataagcaag 

38361 gcgacccaga agaaaagccg ggaattgttg gcgcactctg tagagcctat acgacagaag aagctataga 

38431 aactcttatt cctgatttat acgaaaaaca ttctactaac cgttatacct aecatgaagg ttcaactgca 

38501 ggeggattgg tgetaeaega aaacaacaag cttgcctatc ctcatcataa cacggatccc gtaagcggta 

38571 tgcttgtgaa cagttttgat ttagtacgca cacacttata tggtgctcaa gatgaagacg ctaaaacaga

38641 taetccggtt aatcgactae ctagttataa agcaatgcag caaagagcgc aaaatgatga agttgttaaa 

38711 aagcaattaa ttaacgacaa aatgcctgat gcaatgcagg atttcgatga aatagtaaat agcgacgatg 

38781 catggtctga gacgtcagaa actaccccga aaggcacttt caaagctage atcccaaaca tagaaattac 

38851 attgcgcaac gatccaaatt taaaaggaaa aatagcattt aatgaatcta caaaacaaat tgaatgctta 

38921 gggaaaatgc catggaataa taattctaaa atacgtcaac ggcaagacgg tgatgatagc agtttaagaa 

38991 gttatatcga aaagatttat gacatacacc acccaggcaa aacaaaagat gccattataa gcgtagcaat 

39061 gcaaaatgcc tatcatccag taagagatca tctaaataaa atatcgtggg atggacataa acgtettgaa 

39131 aagttattta tcaaaeactc aggtgccgaa gacactgaag cgaacagaac aactaccaaa aaggcattga 

39201 ccgctggaat cgctcgagta acggagccag gatgtaaatt tgactatatg cttacacttt atggtcctca

39271 aggtgtaggc aaatctgctt tgctaaaaaa aataggtggt gcatggtttt ctgacagttt agtttctgtt 

39341 actggtaagg aagcatatga ggcattacaa ggcgtttggt taatggaaat ggcagaactt gcagctacaa 

39411 gaaaagctga agttgaagct attaagcatt ccatatctaa acaagttgac cggtttcgtg ttgcttatgg 

39481 acactatatt gaagaccccc caaggeaaCg tatcttcatt ggtacaacta acaaagttga ttecttaaga 

39551 gatgaaaccg gtggaagacg tttttggcca atgactgtaa atccagagag agttgaagtg aactggtcta

39621 aactaaccaa agaagagatc gaccaaatct gggcagaagc taaatactat tatgaacaag gagaagagtt 

39691 gttcettaac cctgaactag aagaagaaat gcgttcaatc caaagtaaac atactgagga atctccatat 

39761 acaggtatta ttgatgaata tctcaacacg ccaatcccaa gcaattggga agacttaact atctttgaaa 

39831 gaagacgatc ttatcaaggt gatgttgata tgttaccaac aggaaatgta gattacattg aaagagacaa 

39901 ggtctgtgcg cttgaagtgt ttgttgaatg ttttggtaaa gataagggag atagtagagg atctatggaa 

39971 attagaaaga ttcctaacgt cttaagacaa ttagacaact ggrctgtata tgaaggcaat aaaagtggga 

40041 aaactcgatt tggaaaagat tatggtgtac agatagcgta tgtaagagat gaaagtttag aggatttaat 

40111 acaagaaata ttgaacaaat aeacatcttt agacgctgna ccaaacgteg caecaetttt tgagcgacgc 

40181 aacacggtgg tgtaaaaagt aatcgtaggt gttgtaccat ttttggtgat gcaacattga tgcaacaaat 

40251 gacacaacac ctccttccct tctcgctgta aggttcaacc ctgcttgttt ccaatgctgc atcaaattca 

40321 ctataaagtt taaaaagtag tgttagggag taaaggggta taggggtaac cctctaacag ctatttttaa 

40391 aagtttggca agaactgacg caacatcgga acacaaatac aaattttgta tacaaggtga acaaacgaaa 

40461 gaaecgacat cagaaaaaca tfctagtgaaa gagataacaa agctaaatgg attatgttta aaatgggtcg 

40531 cacctggaac aagaggtgta ccagatagaa ttattattat gccagaagga aaaacatatt ttgtagaaat 

40601 gaagcaagaa aagggaaagt Cacacccttt acaaaaatat gtgcatcggc aattcgaaaa cagagatcat 

40671 acagtgtatg tgttatggaa taaagaacaa gtaaatactc ttataagaat ggtaggtgga acatttggcg 

40741 attgatttca aaccacacag ctatcaaaag eatgcaacag aeaaagcgat tgataacgag aaacacggte 

40811 tgttttcaga tatggggcta gggaaaacag caccaacacc tacagcattt agtgaatcgc agttgttaga 

40881 cactaaaaaa atgttagtca tagcacctaa acaagctgct aaagacacat gggttgatga agttgataag 

40951 tggaaccatt taaatcatct gaaagtgtct ttagtcttag gaacacctaa agaaagaaac gacgcattaa 

41021 acacagaggc tgatacctat gtaaccaata aagaaaatac taaatggtta tgtgatcaac ataaaaaaga 

41091 atggccattt gacatggtcg taattgatga acegtctaca ettaaaagtc ccaagagcca aaggtttaaa 

41161 cctattaaaa agaaattacc actcattaat agatttatag gattaacagg aacacctagt ccaaatagtt 

41231 cacaggatct atgggctcaa gtctacttga tagacagagg cgaaagactt gagtcttcat tcagccgtta 

41301 tcgagaaagg tactttaaac caacacatca agttagcgaa catgtcttta actgggagct aagagacgga 

41371 tctgaagaaa agatatatga acgaatagaa gatatatgtt taagcacgaa agcgaaagat tatctggata 

41441 Cgcctgacag agccgatacc aaacaaacag tagtcctacc cgaaaaagaa agaaaagcat atgaagaatt 

41511 agaaaaaaac cacatttcag aatcggaaga agaaggaaca gctgtagccc agaatggggc atcattaagt

41581 caaaaactac ttcaactatc taacggtgca gtctatacag acgatgaaga tgtaagactt atacatgata 

41651 agaagttaga taagctagag gaaattatag aggagtceca aggccaacca acatcattgt ttcataactt 

41721 caaacatgat aaagaaagaa cacctcaaag gtctaaggaa gcaaccacac tagaggatcc aaactacaaa 

41791 gaacgttgga atagcggaga cactaagccg cttatagcac atccagcaag tgcagggcat ggattaaact 

41861 tacaacaagg tgggcacatt attgtctggt ttggacttac atggccactg gaattatacc aacaagcaaa

41931 tgcaagaeta catagacaag gacaaaacca tacgaccatt atccatcaca tcatgaccga taacacaata 
42001 gaecaaagag tatataaagc cetacaaaac aaagaaccaa cgcaagaaga atcgatgaaa gctattaaag

42071 caagaatagc caagcataag taatggaggt acaagacggg aaaggcgtca tacgatatta agccaggaac 

42141 atttaaatat attgaatcag aaataeataa tttaaatgag aacaagaaag agataaatag atcgagaatg 

42211 gagatactta acccaacgaa agaactagac accaacactg tgtatggacc gttacaaaaa ggagagccag 

42281 tcagaacaac tgagttaatg gcgacaaggt tattgaccaa taagacgtta cgtaacccag aagagacggc 

42351 tgaagcagtt gaaagtgagt actcaaagtc acctgaagat cacaagaaag taataaggtt aaagtattgg

42421 aataaagaca agaagctaaa gatagaacaa acaggggatg cttgtcacat gcatcgcaat acagccacta

42491 caatacgaaa gaaccccgtc aaagcgacag cgcatcacgc aggcatcaaa taacatcgtg caaagattgt

42561 gcaaaaggcc tacaaatctg tagtaatacg atagtaccgg aaagatgrat aaagttacct gaaagctata

42631 cgacataaat acatgaggca catcgctaag cggtgtgtct tttgttatgc aatcaaagag gtgcaagaga

42701 cgaccaagca taataacatt tataagcatg gccgtaagtc atatcaatac gattggttct accatccaaa

42771 agcatggaag aagttaagag agatagcact agatagagat aattatcttt gccaaatgtg ctcacgcgaa 

42841 gatattataa cagatgcaaa gattgtgcat cacatcattt atgctgatga agatctcaac aaagctttag

42911 acttagataa tctaatgtca gcttgttaca gctgtcataa caaaatccac gcaaatgaca atgacaaaag 

42981 caatcttaag aaaattagag ttctaaaaat ttaaataaaa aaattattta aataaaattt tacgcccccc

43051 cgcccatcgg ctcaaaatgt cctttcgccg ggtaccggag aggcc
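
As a reading aid for the ORF table that follows, the sketch below shows one way the a.a. column can be cross-checked against the POS column. It assumes, on the basis of the legible rows (for example 30379..31545 listed with 388 residues), that POS is an inclusive nucleotide range that includes the stop codon and that a.a. counts residues without the stop. The function names `parse_pos` and `expected_aa` are hypothetical and are not part of the original disclosure.

```python
import re
from typing import Tuple


def parse_pos(pos: str) -> Tuple[int, int]:
    """Parse a POS cell such as '30379..31545' into (start, end).

    Stray spaces or dots between the two numbers are tolerated. Any other
    damage (for example a coordinate split into two digit groups) is
    rejected rather than guessed at.
    """
    numbers = re.findall(r"\d+", pos)
    if len(numbers) != 2:
        raise ValueError(f"cannot read two coordinates from {pos!r}")
    start, end = int(numbers[0]), int(numbers[1])
    if start > end:
        raise ValueError(f"start exceeds end in {pos!r}")
    return start, end


def expected_aa(start: int, end: int) -> int:
    """Residue count implied by an inclusive range that ends on a stop codon.

    Assumption: the range covers whole codons and the final codon is the
    stop, so the residue count is (number of codons) - 1.
    """
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError(f"range {start}..{end} is not a whole number of codons")
    return span // 3 - 1


if __name__ == "__main__":
    start, end = parse_pos("30379. .31545")
    print(expected_aa(start, end))  # prints 388
```

A row whose range is not a whole number of codons, or whose implied residue count differs from the listed a.a. value, most likely carries a misread digit in one of the two columns.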
Table 8



Bacteriophage 3A ORFs list 



SID 


LAN 


FRA 


POS 


a.a.


RBS sequence


STA 


STO 


100379 


3AORF001 


1 


a ci c 1 lit o p 
Bala- • 134 o o 


1657 


acaggtaeggat c taagaaaacttt 


ttg 


taa 


100380


3AORF002


•) 

m 


J 'DO / • • 4 U1X4 


ate 


tt taaaataatgaaaggagecgaac 


atg 


taa 


100381 


3AORF003


1 


111 QO 1^1 4Q 

J 41 00. .J4i47 


03J 


t caaagaaaccgaggcgccaagaac 


ttg 


tag 


100382 


3AORF004


-> 
J 


i lien t oita 


OJ / 


gccauutcactagaaaggaaggcgc 


att 


taa 


100383


3AORF005


X 




300 


agaaaaaagacagcccaagaagaag


gtg 


taa 


100384 


3AORF006 


1 


133/1 . . 1 /134 


32 / 


CLLtcacccacaggtaggcgaccca 


atg 


taa 


100385 


3AORF007 


2


19337 , . 20036 


A O Q 


acgacagTiaaaacaagcrcagggcc 


atg 


taa 


100386


3AORF008 


■i 




4 a 4 


aacgatctagggtaggtgttgacca 


atg 


tga 


100387 


3AORF009


1 


40720 . . 42093 


457


gcaaacacc uccacaagaacggrag 


gtg 


taa 


100388 


3AORF010 


3 


13491..14738


415


gaggcggaccaacgctacagtaaaa 


att 


taa 


100389 


3AORF011


2 


2039 . . 3277 


412


attaaagacataatgcgttaaggag


gtg 


taa 


100390 


3AORF012


2 


4001..5209


402


aaaaaagagaaaaaattaaacgcga


atg 


taa 


100391 


3AORF013 


1 


30379..31545


388 


actttatgaacgcgagaacaaatgc


atg 


taa 


100392 


3AORF014 


2 


14738..15562


274 


attatatgggaggtttgactaatta 


atg 


tag 


100393 


3AORF015


3 


3249 . . 4034 


261 


cttgaattaagaaaatctttgaaag


gtg 


tag 


100394 


3AORF016


-2 


25507..26273


228 


aagaagccaagaaaaaaataaaaat


atg 




100395 


3AORF017 


3 


6729 . . 7370 


213


ctaatttttaaggaggaaacaagca


atg 


taa 


100396 


3AORF018


3 


24540..25154




aataaaaCaaaaagCaij^Lgataag 


atg 


taa 


100397 


3AORF019 


2 


31565 . • 32123 


IB / 


ctataaaaattaaaaaggacggtat


ata


taa 


100398 


3AORF020 


3 


36150 . .36713 


187


gcagcaggaactacgacgggtcaag 


ttg 


taa 


100399 


3AORF021 


2 


24011 . .24535 


174 


gcaataaaattr&taaagaaaggaa j 


atg 


tga 


100400 


3AORF022 


-2 


12423 . .12938 


171 


taaagtaccagtagacaatgtaggt


att 


tga 


100401 


3AORF023 


1 


7462 . . 7917 


151 


aaaataaatcaaaggagaataattt


atg 


taa 


100402 


3AORF024 


1 


26731. .27174 


147 


actaaataaaaataaggaggacact 


atg 


tga 


100403 


3AORF025 


1 


42106 . .42543 


145 


taagcataagtaatggaggtataag


atg 


taa 


100404 


3AORF026 


2 


35255 . .35671 


138 


aagcaactaactttatttaaggag


ata 


taa 


100405


3AORF027 


2 


5888 . .6298 


136 


at at t ggct at aat acagt ggx tt t 


atc


taa 


100406 


3AORF028


-3 


27845 . .28255 


136 


ccttttaagatgtttatgatccttt 


ctg 


taa 


100407 


3AORF029 


3 


34344 . .34748 


134 


ttaaggttttagatttagaggtgga 


atg 


taa


100408 


3AORF030 


2 


6299 . .6694 


131 


tataaaaaaggagttggccagataa


atg 


tag 


100409 


3AORF031 


1 


20833. .21225 


130 


ttaacaaaattataggagtgagaaa


ata 


taa 


100410 


3AORF032 


-2 


39984 . .40361 


125 


aaatagctgttagagggttacccct 


ata 


tag 


100411 


3AORF033


1 


7957 . . 8325 


122 


gaatatctgcgtcttttttatttga 


ata 


taa 


100412 


3AORF034 


-2 


28506 . .28871 


121 


gttatcaacctaaggaggtgataac


atg 


tag 


100413 


3AORF035 


-2 


10671 . . 11036 


121 


tcctagcttcctaacagcaccgcca 


ata 


tga 


100414 


3AORF036


2


30020..30382


120 


accaattttaaggaggagttaatca


atg 


tga 


100415 


3AORF037 


2 


21818..22165


115 


aag t gt aag t aa c agr t aagag t ca 


gtg 


tag 


100416 


3AORF038 


-2 


42003. .42347 


114 


gtactcactttcaactgcttcaacc 


atc


tga 


100417 


3AORF039


2 


21386. .21727 


113 


tccagaaaatctagagtcataggtt 


ata 


taa 


100418 


3AORF040


-3 


29654 . . 29995 


113 


ttgattaactcctccttaaaattgg 


ttg 


taa 


100419 


3AORF041 


-1 


4333 . .4671 


112 


tactaaatctacatctgatccatga 


att 






3AORF042




cccb conn 


110 




atg 


tga 


100421 


3AORF043 


1 


25690. .26019 


109 


taccaaattaatatagtcttcgcat 


ata 


tag 


100422 


3AORF044 


3 


29676. .30005 


109 


gtcttaaataattatataaggagtt 


att 


taa 


100423 


3AORF045


3 


30. .353 


107 


cgctagcaacgcggataaattttc


atg 


taa 


100424 


3AORF046 


3 


27694 . .28214 


106 


aagatattgaaaagctaatttcccc 


ata 


tga 


100425 


3AORF047 


-2 


11907. .12227 


106 


tccgccgccaaaatgattagcattt 


ctg 


tga 


100426 


3AORF048 


-3 


40343. .40663 


106 


ccataacacatacactgtatgatct 


ctg 


taa 


100427 


3AORF049 


-3 


6749. .7069 


106 


tgttaaaccatcttcagattctcca


ata 


taa 


100428 


3AORF050


1


42700..43014


104


ttatgcaatcaaagaggtgtaagag


atg 


taa 


100429 


3AORF051 


-2 


13077. .13388 


103 


ttgtacgtaatcccacacatcgccg


att 


tga 


100430 


3AORF052 


-3 


3722. .4024 


100 


gcatctcatttcctcctaacaactc 


att 


tga 


100431 


3AORF053 


3 


17145. .17444 


99 


tcgagacaatggatatagggagtgt 


att 


"9 


100432 


3AORF054 


-1 


19915. .20211 


98 


ataacttatagcttgcgaaacataa 


ata 


tga


100433 


3AORF055 


-1 


42436..42729


97


aatcgtattgatatgacttacgacc


atg


tag


100434 


3AORF056 


3 


40455. .40745 


96 


taaattttgtatacaaggtgaataa


4fg 


tga 


100435 


3AORF057 


-1 


38665. .38952 


95 


accatcaccgtcttgccattgacgt


att 


taa 


100436 


3AORF058 


-1 


21265. .21549 


94 


gaaatttctatctaacttgtcataa 


att 


tga 


100437 


3AORF059


-2 


10278. .10562 


94 


tttagccgcgcttccaactgcacgt 


att 


tag 


100438 


3AORF060 


1 


5278. .5556 


92 


atatcagccgaataggggtgatgaa 


atg 


tag 


100439 


3AORF061 


1 


35668. .35946 


92 


1 1 1 ggaaagaaggapagt t ga 1 1 aa 


ata 


taa 


100440 


3AORF062 


2 


35912. .36187 


91 


gttaaattcggaatggaattaaaca 


ata 


taa 
100441


3AORF063 


3 


36720..36995


91


cggaagtagcggagcgtaaagacat 


act 


tga 


100442 


3AORF064 


-2 


35694..35969


91


ccgttatacgcgctagcactaataa 


ctg 


taa 


100443 


3AORF065 


-2 


32697..32972


91


aaccgttttcttttgtaaattaggt 


ata 


taa 


100444 


3AORF066 


3 


29157..29429


90


caaactttaacatttatctaaagga 


gtg 


tag 


100445


3AORF067 


-2 


26661. .26930 


89 


atacttttttagcggaatcggatga 


ttg 


taa 


100446 


3AORF068 


-2 


9624. .9893 


89 


ttttaatgcatctcccatgtattga 


ata 


tga 


100447 


3AORF069


-3


13847..14110


87


tgcatttcctcctgattcgtgttga


atc


tga 


100448 


3AORF070 


1 


34993..35250


85


tttacgtccaaagagcttttgactt 




taa 


100449 


3AORF071 


2 


34745..35002


85


aaatgttcaagaaatggagtgaagc


ata 


tga 


100450


3AORF072 


-1 


27379. .27636 


85 


tttgtcgttcctcctttaagttgtc 


ttg 


taa 


100451 


3AORF073 


2 


37367. .37615 


82 


tggtaatagctattatcatttttga 


att 


taa 


100452 


3AORF074 


-2 


23466. .23714 


82 


cgtttgtttttttaaaatttaatat 


att 


taa 


100453 


3AORF075


-3 


2471. .2719 


82 


agtactgtttgaaatcctctaacac 


ttg 


tga 


100454 


3AORF076 


1 


26047. .26292 


81 


aagtacgctttcttggcggggaggt


gcg 


tag 


100455 


3AORF077 


2 


28292. .28537 


81 


aacacctcaaaaggaggaataacaa 


atg 


tag 


100456 


3AORF078 


-1 


5836. .6075 


79 


ttttgtataaggcttagatttagtc 


att 


taa 


100457 


3AORF079 


-2 


5460. .5699 


79 


attcagtcgcttttaaaatttctct 


atc


taa 


100458 


3AORF080 


-2 


31350. .31586 


78 


cctgtaatcactttagctttattta


ata 


taa 


100459


3AORF081


-3 


8252. .8488 


78 


aagttttcttaaatccgtacctgta


atg 


tga 


100460 


3AORF082 


-1 


35905. .36138 


77 


atatttatagacaacttgacccgtc


ata 


taa 


100461 


3AORF083 


-1 


34039. .34272 


77 


atagttcacctggattattaaataa 


ata 


tga 


100462 


3AORF084 


-1 


12007. .12240 


77 


acatttttttcatttcgccgccaaa


atg 


taa 


100463 


3AORF085 


-2 


32367..32597


76 


cttacaaggtatagagaaataacga 


att 


taa 


100464 


3AORF086 


-2 


30618. .30848 


76 


atataatctaagttgaggattatct 


ata 


taa 


100465 


3AORF087 


-3 


24746. .24973 


75 


ataggttttaagttcaccctcttca 


atg 


tga 


100466 


3AORF088 


-3 


12980. .13204 


74 


tctttctttttcgtaccaccatgga 


att 


tag 


100467 


3AORF089 


3 


4290. .4508 


72 


acaggagaagcttatcaatctttaa 


atg 


taa 


100468


3AORF090 


3 


28926. .29141 


71 


ttatacacgaaaggagcataaacaa 


atg


taa 


100469 


3AORF091 


-2 


13587. .13802 


71 


cttgtcttgctaattgcttagataa 


atg 


tag 


100470 


3AORF092 


2 


26471. .26683 


70 


aaacgaaacaaaaggagggggttca 


atg 


taa 


100471 


3AORF093 


-1 


2524. .2736 


70 


tccaccgttttcttcatagtactgt 


ttg 


tga 


100472 


3AORF094 


-3 


25334 . .25546 


70 


tggcgctttaatataaaagacgtct 


att 


tga 


100473 


3AORF095 


3 


8316. .8525 


69 


aagagatgggaaagacagaagaaca


atc


tag 


100474 


3AORF096 


2 


36992. .37196 


68 


aacaagttcaagggagctatgagga


atg 


tga 


100475 


3AORF097 


-1 


32593..32799


68 


aaagcttaatacctctgtcgtttat 


atg 


taa 


100476 


3AORF098 


-1 


15346.. 15552 


68 


aatccattaaatcacctacctataa 


ata 


tag 


100477 


3AORF099 


1 


7225. .7428 


67 


actggtgactggatgaacagaaaag 


"9 


tag 


100478


3AORF100 


-2 


22620. .22823 


67 


cgacttcatgaccggcatgtcttaa 


ata 


taa 


100479 


3AORF101 


-1 


40060..40260


66 


aaccttacagcgagaagggaaagag


gtg 


taa 


100480 


3AORF102 


-1 


35035.. 35235 


66 


ttctatctccttaaaataaagttag 


ttg 


taa 


100481 


3AORF103 


-2 


1149. .1349 


66 


at t t t t t tggagt gt tgggt aat ca 


ata 


taa 


100482 


3AORF104 


1 


27661. .27858 


65 


aaacaacttaaaggaggaacgacaa 


atg 


tga 


100483 


3AORF105 


-2 


9420. .9617 


65 


gcctaagtcaaccgcttgattagac


atg 


tga 


100484 


3AORF106 


-2 


23244. .23438 


64 


caccagtaattcttgaattagttga 


ata 


taa 


100485 


3AORF107 


2 


11966. .12157 


63 


tctaaaaaagatgctgtagtagacg


ttg 


taa 


100486 


3AORF108 


-3 


35054..35245


63 


ttttcatcatttctatctccttaaa 


ata 


tag 


100487 


3AORF109 


-3 


16010.. 16201 


63 


gttcttaattccaatgtactgacag 


ttg 


taa 


100488 


3AORF110 


-1 


6184. .6372 


62 


attttcagtgactttataatagtat 


att 


taa 


100489 


3AORF111 


-2 


16500.. 16688 


62 


gtagtcaacaattgctttgtattga


ttg 


tga 


100490 


3AORF112 


-2 


8502..8690


62 


cttaattctcgcctgatacttttcc 


att 


taa 


100491 


3AORF113 


1 


34162. .34347 


61 


tatgaaggattaggagtgtgattgc


atg 


tga 


100492 


3AORF114 


2 


12356. .12541 


61 


ggatatcacactaaggctatagcta 


ata 


taa 


100493 


3AORF115 


-2 


7635..7820


61


tgaagttccctcagctacaccgtga


att 




100494 


3AORF116 


-1 


26434. .26613 


59 


cttagcttctgaagttgtaaaatct


ctg 


tga 


100495 


3AORF117 


-3 


17804. .17983 


59 


atagccattatttctagcttgtgtc 


atg 


tga 


100496 


3AORF118 


2 


27899. .28075 


58 


attgaaaagctaatttccccataag 


att 


taa 


100497 


3AORF119 


-1 


39268. .39444 


58 


acgaaaccggtcaacttgtttagat


atg 




100498 


3AORF120


-2 


37152. .37328 


58 


tagctattaccatgaaacttcagct 


ctg 


taa 


100499 


3AORF121 


-2


18900..19076


58


aaggtactctctcccatttaccact 


att 


taa 


100500 


3AORF122 


-1 


21550..21723


57 


taagcatggtaatcacctcctttaa 


atg 


taa 


100501 


3AORF123 


-3 


33062. .33235 


57 


aaacgttgttctttaataagatctc


ttg 


tag 


100502 


3AORF124 


2 


21212. .21382 


56 


aaattagaagaggttaaaggagaga 


ctg 


tag 


100503 


3AORF125 


-1 


22051.. 22221 


56 


aaatcaggattgaactgcttcccta 


atg 


tga 


100504 


3AORF126 


-2 


7821. .7991 


56 


tgtttttcctgttttacggtcttta 


att 


tga 


100505


3AORF127 


-3 


34712 . .34882 


56 


ttgcattacctattgcgaatgctag 


ttg 


taa 


100506 


3AORF128


-3


24056..24226


56


tttttaaaatcaaagcgtctttgtt


ata


taa


100507 


3AORF129 


-3 


4940. .5110 


56 


cataccatgcagttaatacaaacaa 


ata


«9* 


100508 


3AORF130 


3 


27171. .27338 


55 


cagaattaactatcgatgatttcga 


atg 


taa 


100509 


3AORF131 


-1 


40387. .40554 


55 


ccttctggcataataataattctat 


ctg 


taa 


100510


3AORF132 


-2 


1860. .2027 


55 


gcgataacattcacctccttaacgc 


att 


tga 


100511 


3AORF133 


-3 


42317. .42484 


55 


acaaagttctttcgtattgtagtaa 


ctg 


tag 


100512 


3AORF134 


2 


12671. .12835 


54 


tcatacaaatctttaaaaggttgga 


ctg 


tag 
100513


3AORF135 


-1 


39484 . .39648 


54 


acaatagtatttagctcccgcccag 


att 


taa 


100514 


3AORF136 


1 


29710 . .29871 


53 


accccacaacaaaaaacaccaccac 


atc


taa 


100515 


3AORF137 


1 


37186 . .37347 


53 


ggcagttgtctgaaaatataaggga


gtg


taa


100516


3AORF138


2 


20996 . .21157 


53 


aatggggaaatagtttttaacgaag 


att 


taa


100517 


3AORF139 


3 


15114 . .15275 


53 


tcaactgaaatcgaagcaagtttaa


atg 


taa 


100518 


3AORF140 


3 


29442. .29603 


53 


aaaatggtactaggaggattatcaa


atg 


taa 


100519 


3AORF141 


-1 


39883. .40044 


53 


tacaccataatcttttccaaatcga 


att 


taa 


100520 


3AORF142 


-1 


20416..20577


53


accacctggaaaagccccataaaaa 


att 


tga 


100521 


3AORF143 


-1 


1942. .2103 


53


ataaagcttagaagttgactgatca


atc


taa 


100522 


3AORF144 


-3 


39380. .39541 


53


ttccaccagtttcatctcttaagaa


atc


taa 


100523 


3AORF145 


3 


20388 . .20546 


52 


tctgagtggtcagaattagctatta 


atg 


taa 


100524 


3AORF146 


-2 


2358..2516


52 


aacatgtccatattatgaacaatca 


att 


tga 


100525


3AORF147


-3 


5606. .5764 


52 


gtgatttgtttgtggtagatattca 


att 


tga 


100526 


3AORF148


2


34145..34300


51


tttacttctccgttttatacgaagg 


att 


taa 


100527 


3AORF149


-1


7916..8073


51


tattctcttgatttactaattctaa 


ata 


taa 


100528 


3AORF150 


-2 


11745. .11900 


51


ttcatccttatgtctttgatcagca 


ata 


taa 


100529


3AORF151


-3


7097..7252


51 


tttaccttcatgatacccgtataca 


ata 


tga 


100530 


3AORF152 


1 


21652. .21804 


50 


ccaaaaatattagagacatcaagat 


gtg 


taa 


100531 


3AORF153 


2 


5381..5533


50


tcggctaagtctgaattactattaa




tga 


100532 


3AORF154 


-1 


39670. .39822 


50 


ttgataaaatcgtcttctttcaaag 


ata 


taa 


100533 


3AORF155 


-1 


38233. .38365 


50 


ataggctctacaaaatgcaccaaca 


att 


tag 


100534 


3AORF156


-1 


33040. .33192 


50 


tatctgaaatataatgctttgttaa 


att 


tag 


100535 


3AORF157


-2 


10119. .10271 


50 


cttcaatgacttgctatagctatta 


att 


tga 


100536 


3AORF158


-3 


36074. .36226 


50 


atccgtcttatgatttcgttctggc 


att 


taa 


100537 


3AORF159 


-3 


18338..18490


50 


taaatagtttctactatttggatta 


ctg 


taa 


100538 


3AORF160 


3 


39399. .39548 


49 


gtttggttaatggaaatggcagaac


ttg 


taa 


100539 


3AORF161 


-2 


8976..9125


49 


ttgtacttttagtttttgaacttga 


"9 


tga 


100540 


3AORF162 


-3 


31199. .31348 


49 


tctgtaatatcttcaggtttataac 


ctg 


tga 


100541 


3AORF163 


-3 


14459. .14608 


49 


attatcctgagaagaaacagtttga 


atc


tga 


100542 


3AORF164 


3 


25182. .25328 


48 


tcttttcttagctttttctgataaa 


gtg 


tag 


100543 


3AORF165 


3 


28353. .28499 


48 


aatcttgtctctatgacacggaaag


att 


taa 


100544 


3AORF166 


-1 


8899. .9045 


48 


gtactgcgtcacttgctctttttag 


ttg 


taa 


100545


3AORF167 


-2 


411. .557 


48 


taatacaagttgacgtttagatcct 


ttg 


tga 


100546 


3AORF168


-3


25973..26119


48


gctgagtacttcaatgtgaagatta


atg 


tag 


100547 


3AORF169 


-3 


25151. .25297 


48 


aaaaaaacgcctacaagtgtagacg 


"3 


tag 


100548 


3AORF170 


-3 


24995. .25141 


48 


taagaaaaaagattagtattcattc 


att 


tag 


100549 


3AORF171 


1 


23437. .23580 


47 


aaaggtaataacgtaagggacggct 


att 


tag 


100550 


3AORF172 


2 


32414. .32557 


47 


ctatttgaccctgctgtaaaaaagt 


atg 


taa 


100551 


3AORF173


-1


38005..38148


47


ataagttgtatcatcgaagtaatcc


atg 


taa 


100552 


3AORF174 


-1 


4123. .4266 


47 


atttaaagattgataagcttctccc 


gtg 


tga 


100553 


3AORF175 


-1 


3124. .3267 


47 


ttcatttgaaaatacttagctttca 


ttg 


tag 


100554 


3AORF176 


-1 


580. .723 


47 


cattttctccatcttgtgatacagc


ata 


taa 


100555 


3AORF177 


-2 


39819. .39962 


47 


ttagaaatctttctaatttccatag 


atc


tag 


100556 


3AORF178 


-2 


38466. .38609 


47 


ttagcgtcttcatcttgagcaccat


ata 


tag 


100557


3AORF179 


-2 


33927. .34070 


47 


ttttgcccaatcaacaggcttattc 


atg 


tga 


100558


3AORF180 


-2 


33555. .33698 


47 


cgtctttcgggattttacagtatta


att 


tga 


100559 


3AORF181 


-2 


29538. .29681 


47 


atagtattttttgttgtaaggtcat 


att 


tga 


100560 


3AORF182


-3


17099..17242


47


aatatcactactgcctgcataaggt


atc


tag 


100561 


3AORF183 


2 


23750. .23890 


46 


ttaaaaaaacaaacgtttttagtat 


ata 


taa 


100562 


3AORF184


-1


31648..31788


46


cggaagcttcagacttgcaggaact


ttg 


tga 


100563 


3AORF185 


-1 


30565..30705


46


attttgtttcaaataaagctattac


atc


tag 


100564 


3AORF186


-1 


16951. .17091 


46 


gagaattcaaagtactagtgtataa 


atg 


tga 


100565 


3AORF187 


-1 


7153. .7293 


46 


tatccaacgaatacttttttgaaga


att 


taa 


100566 


3AORF188


-1


1237..1377


46


ccagctcttctaaagaaacaattcc


att 


taa 


100567


3AORF189 


-2 


33309. .33449 


46 


catttgagaagccgatgcttcatat 


atc


tga 


100568 


3AORF190 


-2 


7197. .7337 


46 


gtaacgaacttgcagaatcctctga 


atg 


taa 


100569 


3AORF191 


-3 


41459. .41599 


46 


ccatccgtataaactgcaccgttag 


ata 


tag 


100570 


3AORF192 


3 


4863. .5000 


45 


gatgctattattaacgctttagcag


att 


tag 


100571 


3AORF193 


3 


25965. .26102 


45 


tatacgatactagtttagactcttt 


ata 


tga 


100572 


3AORF194 


-1 


37069. .37206 


45 


ctagtaagaataataatcttagtat


ttg 


tga 


100573 


3AORF195 


-1 


11749. .11886 


45 


cttgatcagcaatagctaataattt 


atc


tga


100574


3AORF196




40764. .40901 


45 


atctttagcaacttgtttaggtgct 


atg 


tga 


100575 


3AORF197 




31969. .32126 


45 


ggctaaaaaccccacctattgactt


ata 


tga 


100576 


3AORF198 




36431. .36568 


45 


tttatttatgacataactaccattc 


ata 


tga 


100577 


3AORF199 




33515.. 33652 


45 


ctccaaaaattaactatgttaggat 


ttg 


tga 


100578 


3AORF200 




21233. .21370 


45 


ataagattataacctatgactctag


att


tga


100579 


3AORF201


1


23293..23427


44


aagcctatcggtggtgtgatatcta


gtg


taa 


100580 


3AORF202 




39088. .39222 


44 


atagtcaaatttacatcctggctcc


att 


taa 


100581 


3AORF203 




16309. .16443 


44 


tttgcttgccgtctaaaatcaactt 


ata 


tga 


100582 


3AORF204 


1 


23845. .23976 


43 


atgtttattatcaatcaaaatataa


att 


taa 


100583 


3AORF205 


1 


29500. .29631 


43 


gtgttgtgcttcacggtcttagcga


ttg 


taa 


100584 


3AORF206


2 


16667. .16798 


43 


gaaaaatcaacagtcttaaatttaa 


ctg 


tag 
100585


3AORF207


-1 


35386. .35517 


43 


tgcagatttatagactccttcttga


atc


taa 


100586 


3AORF208


-1 


30013. .30144 


43 


cagttgagctgtttcatcttttggc 


att 


taa 


100587 


3AORF209


-1 


28366. .28497 


43 


taattcctggtctctagttgggttt 


aca 


tga 


100588


3AORF210 


-1 


15739. .15870 


43 


catcaagcttatttgattccactga 


gtg 


tag 


100589 


3AORF211 


-1 


7693. .7824 


43


taactgaagttccctcagctacacc 


gtg 


tga 


100590 


3AORF212 


-2


4314..4445


43


ggttctgaaacaatttctttagaaa


gtg


tag 


100591 


3AORF213 


-2


4011. .4142 


43 


tgtttgatgtcttccatatcaatat 


ttg 


taa 


100592 


3AORF214 


-2 


1722. .1853 


43 


tctgtctagtttcaaccgaacatta 


ttg 


taa 


100593 


3AORF215 


-3


16616..16747


43


tcttcatttgtctgcgtattagcat


atc


tag 


100594 


3AORF216 


-3


15833 . .15964 


43 


gtcattttgaccgaagttttttgat 


ttg 


taa 


100595 


3AORF217 


3 


6363 . .6491 


42 


gacgcagagctccaaacatatataa


att 


taa 


100596 


3AORF218 




32146. .32274 


42 


aataagctataattaagatttcgaa 


atc


taa 


100597 


3AORF219 


-1


29800..29928


42


ctagggtcatcactttgttcgttta


atc


taa 


100598 


3AORF220 


-1


18409. .18537 


42 


gcattaacctgatacgcttcttctc 


ctg 


tag 


100599 


3AORF221 




13234. .13362 


42 


ttctatcgcctaaccaagatgcacc 


atc


tag 


100600 


3AORF222




12313. .12441 


42 


cccaagctttatctgaggcatctga 


ata 


tga 


100601 


3AORF223 


-1


4915. .5043 


42 


tccatcatagttaattccaaaataa 


ttg 


taa 


100602 


3AORF224 


-1


2125. .2253 


42 


attaactactttataatcttcatac 


att 


taa 


100603 


3AORF225 


-2


26298. .26426 


42 


tcgtttgtaacaacttgatttaaga 


ata 


taa 


100604 


3AORF226 


-2


17184. .17312 


42 


cgcctatttttaaattatctaattt 


att 


tag 


100605 


3AORF227


-2


1425. .1553 


42 


atcttcttcccattctctatagggt 


att 


taa 


100606 


3AORF228


-3


31055. .31183 


42 


cattttttgatgtcaggcagtttat 


ata 


taa 


100607 


3AORF229 




22592 . .22720 


42 


gttataaccatgaccggceacaagc 


ata 


taa 


100608 


3AORF230 


"^1 


27883. .28008 


41 


gaaggcagggtegtttcttggatta 


ttg 


tag 


100609 


3AORF231 


-2 


29988.. 30113 


41 


gcttctttaactttctcttgtacaa 


ttg 


taa 


100610 


3AORF232 


-2 


22485. .22610 


41 


tatctgggaaatttaatctaataaa


ata 


tag 


100611 


3AORF233 


-2 


9264. .9389 


41 


aagtttgccgaaatgactttgagct 


atc


tga 


100612 


3AORF234 


-3 


23033. .23158 


41 


acctaattcagataagcgataattt 


ata 


tga 


100613 


3AORF235 


1 


25558. .25680 


40 


aacactgctgaaatagacgtctttt 


ata 


tag 


100614 


3AORF236 


1 


34420. .34542 


40 


acattgagagaagtttcagaaaaat 


atc


taa 


100615 


3AORF237 


3 


38442. .38564 


40 


gaagaagctatagaaacttttattc 


ctg 


taa 


100616 


3AORF238




33628.. 33750 


40 


caatcattagaaaaccttttttcat 


ata 


taa 


100617 


3AORF239 


-1


29248..29370


40


tcttctaatttagaaatattaatca


atg


tag


100618


3AORF240


-2


18156..18278


40


gtctctcaattctgtatagaatttt


att


taa


100619


3AORF241


-2


8088..8210


40


tttcaaggcttttgtataagtttta


gtg


tga


100620 


3AORF242 


-3


39149. .39271 


40 


ttagcaaagcagatttacctacacc 


ttg 


taa 


100621 


3AORF243 


-3


23558. .23680 


40 


aaaattaactgtttattaattttaa 


ata 


taa 


100622 


3AORF244


-3


1697..1819


40


catttcattaaaggattattattaa


ata 


tga 


100623 


3AORF245


1 


19015. .19134 


39 


agttatgcaaggaatatgatgactt 


ttg 


tag 


100624 


3AORF246


1 


22504. .22623 


39 


gctaatctaaacactttcacatcgt 


ttg 


taa 


100625 


3AORF247 




40567. .40686 


39 


aaagtatttacttgttctttattcc 


ata 


taa 


100626 


3AORF248




23958. .24075 


39 


tttagattcatgaaacgaagtagca 


ata 


taa 


100627 


3AORF249


-1


11113. .11232 


39 


cacctttccccaacacttttacagt 


atc


tga 


100628 


3AORF250 




8719. .8838 


39 


ttttattagcttctactagctttaa


ata 


taa 


100629 


3AORF251 




16899. .17018 


39 


aactcgtctgtcaagcgcttgttga


att 


tga 


100630 


3AORF252 


-3


37025. .37144 


39 


acaactgccctaatttaataactgc 


att 


tga 


100631 


3AORF253




29138. .29257 


39 


tctacatactccaaacaattgatgg 


att 


taa 


100632 


3AORF254 





15476. .15595 


39 


caaatcaattcattaaaatccatta


ctg 


taa 


100633 


3AORF255


1


13552..13668


38


ttaatagacaaagtaaaatcgtggt


ttg 


tag 


100634 


3AORF256


2 


12545.. 12661 


38 


aaaagtgcaaagggctggctaacgg 


ata 


taa 


100635 


3AORF257


2 


41870. .41986 


38 


gggcatggattaaacttacaacaag 


gtg 


tga 


100636 


3AORF258 


3 


10827. .10943 


38 


tcaaacttttgaaaaacggtttagg


att 


taa 


100637 


3AORF259 




34570. .34686 


38 


gtgacatcgaaccagtacggatcac 


gtg 


tga 


100638 


3AORF260 


-1


32389..32505


38


aagcaggtaagccaatacgcattga


att 


tag 


100639 


3AORF261 


-1


23830. .23946 


38 


cctttttaacttttaataaaattaa 


ata 


tga 


100640 


3AORF262 


-1


8158..8274


38


ccatctcttctggttcagtttctga


atc


taa 


100641 


3AORF263 


-2


14001. .14117 


38 


ttatacctgcatttcctcctgattc 


gtg 


tga 


100642 


3AORF264 


-2


294. .410 


38 


tttgcttgtttttattttcccttga 


gtg 


taa 


100643 


3AORF265 




42683. .42799 


38 


tgacaaagataattatctctatcta 


atg 


tga 


100644 


3AORF266 


-3


31979..32095


38


aatcctcatcatcagtgtctaattc


atc


taa 


100645 


3AORF267 


-3


26306..26422


38


ttgtaacaacttgatttaagaatac


atc


tga 


100646 


3AORF268 


-3 


16490. .16606 


38 


tacatacaaggcttagcctttttat 


ttg 


tag 


100647 


3AORF269


-3 


9872 . .9988 


38 


tgagacccctctaaccctgagttag 


ata 


tag 


100648 


3AORF270 


1 


21829. .21942 


37 


atagttaagagtcagtgctccggca


ctg 


tag 


100649 


3AORF271 


2 


29468. .29581 


37 


tgagcgacacatataaaagctacct 


att


taa


100650


3AORF272


3 


2955. .3068 


37 


gagttaaacagattttacttgcagc 


ata


taa 


100651 


3AORF273 


3 


5010. .5123 


37 


tttggcaaaccagtagtatttacag 


at^ 


taa 


100652 


3AORF274


3 


19956. .20069 


37 


tcaagtatagatgaattaaagcaac 


ttg 


tga 


100653 


3AORF275


3 


39883. .39995 


37 


gatatgttaccaacaggaaatgtag 


att 


taa 


100654 


3AORF276 


-1 


27211. .27324 


37 


attaagtgcgcttatttaattagat


att 


tga 


100655 


3AORF277


-1


13516..13629


37


cgaccgtcattaaagttaagtccac


ctg 


tga 


100656


3AORF278 


-1 


11893. .12006 


37 


ttttatatacacgaccactggataa 


atc


taa 
100657


3AORF279


-2 


17535 . .17648 


37 


tttgtaaagatctgtttactgctgc


ttg 


taa 




3AORF280 


-2 


6474. .6587 


37 


ccaaaacaagcacctaaccgactag 


acg 


taa 




3AORF281


-2 


759. .872 


37 


tcctgatatcgctgcgccataatgg 


att 


tga 




3AORF282


-3 


36608 . .36721 


37 


cccaaaacctccctgactcgatcta 


ata


tga 


100661




-3 


14960..15073


37


tttcagccgaagaaccatcctttaa


att 


taa 




3AORF284


1


18859..18969


36 


atgctaacagagccaggtccttact 


att 


taa 


iuuob J 




2 


8237..8347


36 


aaaactcatacaaaagccttgaaag 


ata 


taa 


100664


3AORF286


3 


5157. .5267 


36 


tatgatcagcaacgtacattagaca 


gtg 


tag 


1 A AC £. c 




3 


38610..38720


36 


tttgatctagtacgcatacacttat 


atg 


taa 


i nnccc 
iQOooq 




_ i 


36454. 36564 


36 


tttacgacacaaccaccattcatac 


ata 


tga 


iuuob < 




. i 


30217 30327 


36 


aacaattttttcacaacgcccttct 


ttg 


taa 


100668




. 2_ 




36 


gcttttttgcaaattctaacagctt 


atc


tga


100669


3AORF291 


-2 


14310. .14420 


36 


gc ct agr t aaagggat aaccat etc 


ctg 


tga 


1UUO / u 




-2 


11457..11567


36 


ttctttcaattctttgattttctga 


ttg 


tga 


lUUo i ± 




_ 3 


29462..29572


36


ttcataaaagtattccttataaaat


atg 


tag 


iuub / 




_3 


22388..22498


36 


accattccaattttggccaaacgat 


gtg 


tag 


100673 


1 &AD0OOC 


_ 3 


iOD*2 • ( i,0 / J J 


36 


a aaaqaaa c q c c t c 1 1 qagt qaag t 


att 


tag 


iyuo /* 


JAUaT X J? d 


.3 


6332 . . 6442 


36 


tatcagacatgaagtctgaaggtaa


atc


taa 


100675






13984..14091


35 


aaatggttgaagtcacttaaaggta 


gtg 


tag 


100676


i jnooi act 


i 

4 


40174..40281


35 


tatcaaatgttgcatcattttttga 


gtg 


taa 


iuub / / 




2 


1481 . . 1586 


35 


gccgcgtgtgctactttgcgttag


ata 


taa 


1UUO / 0 




2 


40451. .40558 


35 


aatataaattttgcatacaaggtga


ata 


tag 


lUUt / J 




3 


25479. .25586 


35 


accactagttaaaacttcatatact 


ata 


taa 


lUUODU 




3 


32106 . .32213 


35 


gaagatgatttcgatgaattagaca


ctg 


tga 


100681


3AORF303


3 


36024 . .36131 


35 


gacacagagggattactaaaagaga 


ttg 


tag 




lino 177 rtd 




37762..37869


35


accgacaaatccgccaacatcttc


ata 


tga 


100683


3AORF305


-1 


24086 . .24195 


35 


tttatccttaacaaaatcaaactga 


ata 


tga 


100684 


3AORF306


_ ^ 


19507..19614


35


atcattaggtaattgaaatttcaaa


ata 


tga 


100685






160A1 16188 


35 




atc


tag 


100686 


J AUHr J Uo 


i 


ll"49fl 11505 


35 




ttg 


taa 


100687


3AORF309


_ -> 
* 




35 


aaacagacctcttacccgttcatca


ctg 


taa 


100688 


-j » t-> ^ « f\ 


-2 


)iao4 ?<;noi 


35 


ataaatcaaaatcgctaccagctga


att 


taa 


100689 




4 


32005 22112 


35 


ttcgtaggtgtcattacttctttaa 


ttg 


tag 


100690




-2 


51711 21A1S 


35 


aaaataaaaagccagtgccgaagca 


ctg


tag 


100691 


1&AB911 1 


_ ^ 


1T9Q1 1800B 


35 


cattaggtcttagacgacttagcat


ata 


taa 


100692 


3AORF314




40 /Xv. iXOffX/ 


35 


taattcagtcttaggagtatcattt


att 


tag 


100693 


3AORF315


*2 


Xj7'V» »1BWJ ' 


35 


acatatctccgtatcatttgggtaa


att


tag


100694 


3AORF316


-2 




35 


aattcttcttcacactgtttgacga


ttg


tag


100695




J 




35 


tccctaacactacttttta&acttt 


ata 


tga 


100696 


3AORF318


3 


77»;15 37642 


35 


tgttcggctcctttcattattttaa 


ata 


taa 


100697 


"t ft r\n t> 1 1 a 


-3 


J44iX< .Jf 94B 


35 


ttcttcatcttttatttgactctgc 


ata 


tga 


100698 


3AORF320


"3 


7a?£3 1R3K9 
^0«D^. iiDJB7 


35 


catttgxtggtaatatcttagttcg 


atg


tga 


100699 


3AORF321


1 


4JDB9. .X*UJJ 


34 


e&aaaaoacttaatatiaaaaatqca 


ata 


tga 


100700 


3AORF322


1 


34QOU. i J»'B* 




aaaaaaaaattaagaccataacttt


atg


taa 


100701 


3AORF323 


3 


TAl AC 




rt-aaata ctaaact atcaae tat aa 


att 


taa 


100702 


3AORF324 


3 


JUisB. . 


34 


aaaaa aaacrt t cc 1 1 aaaaaaac aa 


ata 


tga 


100703 


3AORF325 




4Ui4B . .tUJlU 


34 


gttgtatcatttttggtgatgcaac 


att 


tag


100704 


3AORF326


"1 


1COC4 TlfltO 
JB7QV • » J 'VOO 


34 


cgcatcaacaactgtaaacctttga 


ttg


tga 


100705 


3AORF327 


-1 


39*4* • 


34 


attttcatctat totataatatt tt 



ctg


taa 


100706 


3AORF328


~ 1 


22020 

» i44V4U 


34 


ccatttaccttcttgagatgttgga 


ttg 


tga 


100707 


3AORF329


-1


1 0020 1 R924 
XBB*U . iXD74i 


34 


g^tggc 1 1 aact tccaagaaccaac 


ctg 


taa 


100708 


-> ■ aq0^ 9 A 


-1 


^ ecii 1 stic 

13P J A • • X.9 'J5 


34 


ttatgaagttttcacaaattagtaa 


ate 


tag 


100709 




~ 4 


174QB 3S102 


34 


t t acgcccaat age t tea tact cat 


ctg


tag 


100710 


t&ABfft 15 
JAUXJfjJX 


* 


7463 


34 


tctacaaaccttcaaagttttagtc 


ata 


taa 


100711 


3AORF333




3458.4 24688 


34 


aaaaattataaaactataaaaccat


ate 


taa 




3AORF334


* 3 


942£9 24373 


34 


tatttttaggtagataatttattaa 


ate 


tga 


100713 


3AORF335


-> 

"3 


14773 14377 
144 / a. * X%3 f 1 


34 


cacttcagcaagttgatgctttgta 


ate 


tga 


100714 


-] "IADP1 If 


£ 




33 


gtaactttatctaatttagaagegg 


ata 


tag


100715 




* 


13377 13378 


33 


aatataggxaaaaaagcaggagaat 


ttg 


tag 


100716 


3AORF338


3 


9501. .9602 


33 


taggacgtacgatgacgatgggcgt 


ate 


taa 


100717 


3AORF339 


3 


27348. .27449 


33 


atatctaattaaataagcgcactta 


att 


tga 


100718 


3AORF340 


-1 


37372. .37473


33 


ttctatggttctcatcttatgagaa 


atg 


taa 


100719 


3AORF341


-1 


33421. .33522


33 


aagctaattcggacacttttccttt


ttg


taa


100720 


3AORF342


-2 


29047. .29148 


33 


tttggcatctctatcactcctttag 


ata 


taa 


100721 


3AORF343


-1 


7549. .7650


33 


atgatacgcctgagactagaattgg


att


taa


100722 


3AORF344 


-1 


7297. .7398


33 


ctgctqaaactgttgcagattttga 


att


tga


100723 


3AORF345


-2 


23850. .23951 


33 


ttaaacctttttaacttttaataaa 


art 


taa 


100724 


3AORF346 


-2 


20607. .20708 


33 


aaagatgtacgactagatttagtta


ate 


taa 


100725 


3AORF347


-2 


14175. .14276 


33 


atctqttgttaaagaacgctaataa 


ctg 


taa 


100726 


3AORF348 


-2 


6984. .7085 


33 


cgtacactggttgacctgttaaacc 


ate 


tag 


100727 


3AORF349


-2 


6882. .6983 


33 


tagaacgaccaataactgtatttag 


ate 


taa 


100728 


3AORF350


-3 


40748. .40849 


33 


aactgcaattcactaaatgctgtaa


gtg



100729 


3AORF351 


-3 


38345. .38446 


33 


ggccagcagaaegtttttcgtacaa 


ate 


caa 


100730 


3AORF352


-3 


38081. .38182 


33 


eagttgaaggccaatacatcaacct 


atg 


caa 


100731 


3AORF353 


-3 


35432. .35533 


33 


cagcattctcatatgacgcagattt 


ata 


caa 


100732 


3AORF354 


-3 


34952. .35053 


33 


ctatcctgatacagatatctcttag 


acc 


caa 
Table 9



Bacteriophage 96, complete genome sequence 



1 cacagttaca ggcttttcag ctatatacca 

71 gaaaccttga tttaatgggg ttttaaccta 

141 cgttgacctt gctctttttt atgttcatca 

211 aatggcctaa tcttttgcta atatattcaa 

281 tcttaatgaa taaggtgtta ttgtagtatc 

351 ttaacggcat tatgactcaa tttaaacaac 

421 taatatgttg tatatccttt tttggtacct 

491 atgtattgta ccctcttttt cgttt agate 

561 gtgatagcta ggatgaataa aaaaatataa 

631 agtattgttc tatggtgatg aatttagagt 

701 atagctaggg tctttcttca aatagccctc 

771 aatttacgaa ccgtctcatfe agtacgacct 

841 tgatgttett tattaaaaaa tcactcccga 

911 gaattgttgt gaagegacat gtttcttatt 

981 ttattttcat ctaaattgtt tccatcatcc 

1051 ctttagtttt gaatcctgac tttcttttct 

1121 agatgctgtt gctttattct tcctttttgt 

1191 ggcaaaaaat aataagggta ggegagctae 

1261 cttttcctac ttcttctcta aaactatcat 

1331 tccagcatgt tggtttttgt ceggattatt 

1401 tegtaactag gttcgtttgg gtcgcgtggt 

1471 gtacctgttg cttagatgtg ttattggttt 

1541 ttgattattg ttatcgtttt gattactatt 

1611 ttgtctttgt tctctttctt tgtttcggtt 

1681 atgcacctaa caccaacgca ctagctaata 

1751 tgctatttgt tttaataaat ctatgatttc 

1821 tegtctaaca tctctattaa gacgaaaett 

1891 taggattaga aaacgaacta ctgaaacgcg 

1961 taacatatct ttaccgctct cagacattgt 

2031 tattttgttt cctgatttct ttcgatttct 

2101 tatcacgttt ctcagaaact gacatacgat 

2171 ttttceggea gtccaagact ctttaactgt 

2241 ccttttctca tatttcttta tatttaaaaa 

2311 agttccaata cegtatatet tcttatattg 

2381 tactcagaca actcatacaa gttacgtacg 

2451 ctgagataaa gccgtgtcgt ettgegtaat 

2521 gttgecatae gtcaacttgt ggtgggcaag 

2591 gaaggtexaa taaaaatttc tccttcttga 

2661 tcacttcaac ttcacatttc ataagcaatt 

2731 tttctttcta tctctaaccc attgeataaa 

2801 ttatttgeat gaeeggctat agtttcttga 

2871 taaagtaatc tgctaattgt tggacttttg 

2941 tgttgattga cttaccccga ttgettcaga 

3011 tctaagttct ctgataaaat ttttctagca 

3081 gtaatactaa tttaccataa gtaatatcac 

3151 tttaggtgtt gacatattac tttaagtgat 

3221 cagaaaattt taaagagttc tctgtaaagg 

3291 tgataaatta ggcgttacta aacaatctgt 

3361 caattgtatg etttagecaa attattcaac 

3431 aatatcactt taagtgataa aggaggaaac 

3501 ccagtaagga aaattgaagt ggaaggagaa 

3571 atgeacgage agataacgee ataegcaate 

3641 gtcaggtcaa aacagaaata tgatcatcat 

3711 aaacaaagta aaaacgaaaa cattagagaa 

3781 taccgaegtt aagaaaaact ggtgcttacc 

3851 tgaagctaca gaagaaacaa aacaagaaat 

3921 caaaaactgg atgegggaga ctacaatttc 

3991 gaetacatge gataacaaac caaaaacaac 

4061 gatgactggt gcgagttcaa gaacgaacgt 

4131 aattggttcc cgtcacaagc tactttatac 

4201 aggagaggct gaatatggaa tacateggat 

4271 aaaagatgat ctagagaaaa aagtctactc 

4341 cgaggacaaa agegttatat aaaaattgac 

4411 aatacgaatt ataggaggag ttatcaaatg 

4481 tcacagtctt agcgattgta ettatgeegt 

4551 aagtatcgea acattcatat actacaaaga 

4621 caagtaacag tgacaaacac etaccaaaat 

4691 tatggctgaa aatattaaaa ctgaacaaca 



agataagact tatcccgccg tctccataaa aatatgcctg 
gcaagtgtca aatatgtgtc aagaaaataa ttttctgaca 
agcaagtgag agtaggtgtc taaagttata gacatattac 
taggtatacc tttagaaagt aggaaagatg tacgcgtgtg 
atttagtccc atttgaccct tagcatggtt aaatgacttt 
ttactatctg tacgttttgg taattttgat aatttagctt 
ccacaagtct gtccgegrta acegtctttg ctccacgaag 
gataggcaac atattaatta catcgctgta tcttgcacca 
etcgattegt ctctagattt aaagtatcct ateaattgea 
gttcgtcttt tgattttttt gtaccacgaa tatqtatttg 
atataccgea tccctgaagc attgtgataa acaactgttt 
cgaccgaatt cgttcaaaaa cttttgatac tecgaaegtt 
aatattcgtt aaataatttt aacgaaegtt gataccaata 
ttttgaatct aaccaaccat tgtaatattc ttcaaacttt 
aaatctctaa gcagttgttg agcagcgttg gttgcctcag 
ttcctgattt gaaagacgga tgttttacgt cgtactgcca 
aattgtaaat gacgccattc tacttttcct cctcaaaatt 
ccgaaatttt attgttgaac aactattget tcacctcttg 
atgattgact agggtgrgtt aacgacactc ctggaccacc 
ttccatttct tcagtggctc ttttagcatt taaatattct 
tgtgcttgtt gtccattatt ggtagctgga agattcttct 
gttgattgtt gttaatgttt gtgttgttct cgttgtttac 
ttcttttttc gettetgett tatetttagt ttctttcttt 
ttcttgcttt cctctttctt atcgccgtcg ttgctaccgc 
ataaaactaa taatcttttc atgttttaca ctcctttatt 
attgttttgt tctatgattt tgttttcatt tttaagatgt 
tgatttatca tttegtaage aaacatttga cctgtgttgt 
ttgaaaagct atctataaac tgaccaactt tattttttaa 
acttagttcg cgcttattta aagttttttc tacaattttg 
tctacttcaa aagggatatt gttattaaat ttttcgataa 
caaatacttg tttttgacct ttatttaact tccctcgaat 
taacttatca ttaggaactt gattcatctt ttat&tgact 
ctctcaacgg ctcaaatgta atcgaatact cgccatagtg 
ttctattgee tccaatatgt attcttcget taattgtaga 
ccataattgt aagcttctac aatttcgcgt aacgggactg 
tttcgaactt gcgattgttg aatttcgatt gatctaaaat 
ttcttcatat aatacttcta atttgttcct ttcggataag 
taccaaccat cgaatcctcg aggtactctt tgtgtttctt 
ettegtattt tcccacgcgc caaacccctt tggtgtctta 
attttcgatt tcttcccatt ettegggagt aaattcatct 
tgaatacttc tttcttxtgt aattctcgat ttaggtacat 
atattctagg atatttaagt tetttaagee agttagagat 
caattctact tgagtaatgt tgxtctcttt cataagttgt 
ctcttatatt ccataatttt ctcctttagt attacttaat 
ttttcaatac aaaatattac ttttttgaaa taaatatcac 
agtatagttg taaatgtcaa cgggaggtga tacgaaatgc 
tctggagaac taactcgaat atgacacaac aagatgtcgc 
aataagatgg gaaaaagatg aegcagaatt aaaaggctta 
acagaagttg attatataaa ggctaaaaaa atttaacatt 
tgaaatgcaa gaattacaaa cattcaattt tgaagaatta 
cccttctttt taggtaagga tgttgctgaa attttagggt 
atgttgatag tgaagatagg ctgacgcacc aaattagtgc 
caacgaatct ggattataca gtttaatctt tgaegcttet 
acegctagga aattcaaacg ctgggtaacc tcggaagttt 
aagtacctag tgacccaatg caagcattga gattaatgtt 
taaaaacgtg aaagatgatg ttattgattt gaaagaaaat 
ttaaccagaa caatcaatca aagagtagct catatacaaa 
gtagegaatt attcagggat attaattcag aagcgaaaaa 
aagacaaaaa catttcgacg atgtaactga aatgattget 
agaatcaagc aaattgaaat gaaattttaa aacgaaatat 
atgeagaege aaatgcgttt gtaaaaataa gtggcatttc
gaacaaagag tttcaaaaag aatgcatgta cagatttggt
aaagctattc aatttategg taccaattta atgattaatg
agtaaaactt ataaaagcta cctagtagca gtactatget 
ctctatactt cactacagcg tggtcaattg caggattege 
atacttttat gaagaataaa aaaactgeta ettgegtcaa 
atacaactta attaaatcaa aatatacgga ggtagtcaac 
ctattacact aaagatttct caggatacag aaatgaagaa 
4761 gataactttg tagcaaatca agaattgaca gcaacaacca cattgaacga gtacagaaaa cttattgaaa

4831 taaaggctgc taaagacaaa gaagaagaca cccacagagg taagtatttc gcggaagaaa gaaaaaacga

4901 aaaattggaa aaagaaaata taaaactaaa aaacaaaatc tatgaattac aaaacgaaga agataacgag

4971 gaggacgaag aagacaagga ggacgagaac gacgtattac aaaactggtg agataaaaaa caaaattaca

5041 agctttaacg ggttcgaatt taaagtgccc gcgatgaaga gacacgacgg tatcagtaca caaatcaagg 

5111 atatgaataa cgttccacct aaaccgtttc atgccataga tttaagcgaa ctatatatcg cgacggatgc 

5181 aatgcgtgac gtcacaaacg aacggactga aaataacaca gatgaacagg acaaactaat taacttagtc 

5251 atgaaatggt aggaggtatg aaaagtgaat gactcacaag agagagaatt agaaacattc gaacaagacg 

5321 accgattcaa agtaactgat ctagacagtg ctaactgggt ttttaagaaa ctggatgcaa tcacaactaa 

5391 agagaacgaa accaacgact tagcaaataa agaaactgaa cgcataaacg aatggaaaga taaagaagta 

5461 gaaaaattac agageggeaa agaatattta caaagccttg taattgaata ttacagaaca caaaaagaac

5531 aagatagcaa attcaagttg aatacacctt acggaaaagt gacagccaga aaaggttcaa aagtcattca

5601 agttagcaat gagcaagaag ccateaaaca actcgagcaa cgaggttttg acaactatgt aaaagtaact

5671 aaaaaaccta gccaaccaga cattaagaaa gatttcaatg Caactgaaaa cggcacattg attgacgcaa 

5741 acggcgaagt ttcagagggt gctagcattg tggagaaacc aacgtcatac acggcaaagg tgggagaata 

5811 gatgactgaa aaaactaatc aagatgtcga tatcctaacg caaccaggtg taaaagacat cagcaaacaa 

5881 aatgcaaaca agttttataa atttgcgaca cacggcaagt tcggtactgg taaaactacg tttttaacaa 

5951 aagataacaa taccttagta ctagatataa acgaggacgg aacaacggta acagaagatg gggcagttgt 

6021 gcagattaag aattataagc attttagtgc agtgattaaa atgctgccta aaattattga acaactaaga 

6091 gaaaacggaa aacaaatcga tgttgtagtg attgaaacaa tccaaaagtt acgtgatatc actatggacg 

6161 acatcatgga eggtaaatca aagaaaccga catttaatga ttggggcgag cgtgccacac gcattgtaag 

6231 tattcatcgt tatattteta aatcacaaga acattatcaa tttcaccttg ctataagcgg acacgagggc 

6301 actaacaaag acaaagatga tgagggaagt actatcaatc caacaatcac gatagaggca caagaceaaa 

6371 taaaaaaagc agtcatcagt caatctgacg tgttagcaag aatgacaata gaagaacatg agcaagacgg 

6441 cgaaaaaact tatcaatatg tacttaacgc tgaaccatca aatttattcg agacaaagat aagacactca 

6511 agcaacatca aaattaacaa caaacgtttc attaatccaa gcattaacga tgttgtacaa gcaattagaa 

6581 atggtaatta aaaattaatt aaaaggacgg tataaaaatt atgaaaatca ceggtagaac acaatacatt 

6651 caagaaacca accaagaggc attcatgaaa ggtggggact tttcaggagc tggagaattt acagtaaaag 

6721 ttgcaaatgt cgagtttaac gacagagaaa acagatactt cacgategtt tttgaaaaca acgaaggtaa 

6791 acaatacaaa cacaaccaat tcgtcccacc accccaacaa gattatcaag aaaaacaata tatcgagtta 

6861 cttagtagat taggaatcaa attgaaetta ccagatttaa cttttgacac agatcaatta attaacaaaa 

6931 tcggaactat tgtacttaaa aataaattta acgaggaaca aggcaagtat tttgtaagac tctcatatgt 

7001 aaaagtttgg aataaagacg atgaagtagt taataaacca gaacctaaaa ctgatgagat gaaacaaaaa 

7071 gaacagcaag caaacggtaa acagacacct acgagtcaac aatcaaaccc aetcgctaat gctaatggtc 

7141 caacagaaat caatgatgat gatttaccgt cctaggacgt ggtttaaatg caatacatta caagacacca 

7211 gaaagacaat gacggtactt actccgtcgt tgctactggc gttgaacttg aacaaagtca cattgattta 

7281 ctagaaaacg gatatccgct aaaagcagaa gtagaggctc cggacaataa aaaactatct atagaacaac 

7351 gcaaaaaaat attcgcaatg tgtagagata tagaacttca ctggggcgaa ccagtagaac caactagaaa 

7421 attattacaa acagaactgg aaattacgaa aggttatgaa gaaatcagtc tgcgtgactg ttcaatgaaa 

7491 gttgcgagag agttaataga actgatcata tcgtttatgt ctcatcatca aatacccacg agtgtagaaa 

7561 cgagtaagtt gttaagcgaa gataaagcgt tattatattg ggctacaatc aaccgcaact gtgtaatatg 

7631 cggaaagcct cacgcagacc tggcacatta cgaagcagtc ggcagaggta tgaacagaaa caagatgaat

7701 cactacgaca aaeatgtgtt agcactgtgt agacaacatc ataatgaaca gcacgcaatt ggtgttaagt 

7771 cgtttgatga taaatatcaa ttgcatgact cgtggataaa agttgatgag aggcccaata aaatgttgaa 

7841 aggagagaaa aatgaataag teactaatag atgactaccc gacacaagta ttaccgaaat tagctgaatt 

7911 aatagggtta aacgamgcaa tagtattgca acaaattcat tattggctaa acaactcaaa acataaatac 

7981 gatggcaaaa cttggatttt taattcttat ccagaatggc aaaaacaaet tceattctgg agcgagagaa 

8051 ctataaaaag gacatttggg agtttagaaa aacaaaactt attgcatgta ggtaactaca acaaggctgg 

8121 atttgaccgt acaaaatggt attcaateaa ttatgaaaca ttaaacaaac tagtggcacg accatcggga 

8191 caaaatggcc cgacgatgag gacaaattgg cacgatgcaa gaggacaaaa tgacccgacc aataccacag 

8261 actacacaga gactaacaaa catagagaga cagacgacgt ctcaaagtca eetaagtata ttagtaccaa 

8331 Cttagaaatt atacaaaacc ctttaaaagc agaacagtca gaacacgaaa ttaaatcatt taagcaagat 

8401 cagttcgaaa tagtaaaagt cgctaccgat tactgcaaag aaaacaacaa aggtccgaat tacctactaa 

8471 ctgtattaaa gaactggaat aaagaaggcg cttcagataa agaaagtgcc gaaaacaaat tgaaacctcg 

8541 taactctaaa aaagaaacta ctgatgatgt catagcacaa atggaaaaag aattgagtga tgactaatgc 

8611 cgatgagcaa aacacaagea ttagaaatta ttaaaaaagt taggtacgta tacaacatcg attttgataa 

8681 accaaagtta gaaatgtgga ttgatgtatt aagccaaaac ggggattatc aaccaaccgt aaaagctgta 

8751 gatggatata tcaacagtaa caaeccgcac ccgcctaacc taccagcaat catgcgtaag gcacctaaaa 

8821 aagtatctat tgagccggta gacaacgaaa ccgctacaca ccaatggaaa atgcagaatg accccgaata 

8891 cgtcagacaa agaaaaatag cgctagataa cttcacgaat aagttggcag aatttggggg cgataacgaa 

8961 tgaattacgg tcaattcgaa attgaaagca caacaatcgc cacgctacct aaacaaccgg acgtactaga 

9031 aaagataaga gttaaagatt acatgtttac gaacgaaaag tctaaaaccc ttttcaatta tgtaatggac 

9101 gccggaaaga tagatcatca agaaatctat ttaaaagcaa ctaaagataa agagttttta gatgcagaca 

9171 ctataactaa actttacaac cccgatccca ttggatacgg attccttgaa cgttatcaac aagaattatt 

9241 ggaaagttat caaatcaaca aagcgaaaga attggtaact gagttcaaac aacaacctac gaaccaaaat 

9311 tttaataact tgattgatga actcaaggac tcaaaaacaa ccactaacag aaaagaagac ggaaccaaga 

9381 agtttgttga ggagcccgtc gatgagttat acagcgatag ccctaagaag caaactaaga cgggttataa 

9451 gctcatggat tacaaaatag ggggattgga gccgtcgcaa tcaaccgtca tcgcagegeg cccctcagtg 

9521 ggtaagacag gttttgcatt aaacatgatg ctgaacatag cacaaaatgg atacaaaaca tctttcttta 

9591 gtctcgaaac aactggcaca ecagtattga aacgtatgtt atcaacaatt actggtattg agttaacaaa

9661 gacaaaagaa atcaggaact caacgccgga cgacttaaca aagttaacga atgcgacgga taaaaccatg 

9731 aaattaggca tcgatatttc egataaaagt aatatcacac cgcaagatgt gcgagcgcaa gcaacgaggc 

9801 atteagacag gcaacaagtt atttttatag attatcttca accgatggat actgatgcga aagttgacag 

9871 acgtgtagca gtagaaaaga tatcacgtga cctaaagata accgctaacg agacaggcgc aatcatcgta 

9941 ccaccctcac aaccgaatcg cggtgtcgag tctagacagg ataaaagacc aatgctatcg gacatgaaag 

10011 aatcaggcgg aatagaagca gatgcgagtt tagcgatgct actttaccgt gatgattact ataaccgtga 
10081 cgaagatgac agcatcactg gcaaatctat

10151 ggaataattg aatttgagca ttacaagaag 

10221 cttattgaaa tcgatgtatg aagagacaaa 

10291 ggttgggcgg tcaatagatt gttggacaat 

10361 agaaaatcat gaatgaaacc aactggaaga 

10431 atattactta taccgagaag atggcacgga 

10501 gtttattcgc ceacaggagc ccatttcage 

10571 tcaaaggcgc ccacgggctt ctatacgagc 

10641 gtggcacaat gagcaaatac aatgctaaga 

10711 gtgcgaacac caccaatatt tagaaagtaa 

10781 aaatttgaat tacaacctaa attcgggaaa 

10851 aggaagggaa actggttgaa gteatagacg 

10921 gatattcaga tatcagtata gagatgtgaa 

10991 gaatggatgg tatacgagga cccagtgaaa 

11061 caacaacaag catatataaa cgcaacaatt 

11131 atgatgtgga taaagaaaaa gatacgctgg 

11201 tgacaacata acaataagac atgcatatat 

11271 aatttaaggg taatcatggc tagagataac 

11341 gatcaactat tagtggatat aaaaacggaa 

11411 agatgcccta ggtgttaatg taagtgaacc 

11481 attaaaaaag taaatgtata gaggtggaac 

11551 taccgaacat gaaaatgaat tgataaaaaa

11621 ggtggctggg cgttgttaga agccttacat 

11691 tgttatccaa aatcatggag cgagagagca 

11761 actacgtaag aagaagccac atttgtttaa 

11831 gteacetata accaaaegtt caagaaatgg 

11901 eatgaacaaa acgcaagaca atgtcaaaca 

11971 tttatcgaac aggctacggc acagtatcea 

12041 tgtctagagc accgttaaag aatggtcatg 

12111 tgactcgtgg gagggtcaac gatggcaacg 

12181 tgggattaga agactgtgaa aaatatacag 

12251 tgtgattgaa aactataaga caagcatatg 

12321 cctaattgtt aaaaggagtg atgaccatga 

12391 ctctaagaga tatctgtate aagacaacga 

12461 tttcacggac attataaaac gatgtttaaa 

12531 catatataaa gcaacatgat ctggaatatg 

12601 ataatggcaa agattaaaag aaaaaagaag 

12671 ctgaacaagt tgaaagtaaa gtgtttcaat 

12741 tttttcaact gatgggcacg ggttttacac 

12811 acagaggaag tcactgaaga tactgagttt 

12881 tatacgaaaa tgattcaatc agagagttga 

12951 taaaactatg acattaatct ggaaagatgg 

13021 gtatcaagtt ctttgcatct aaatgtgatt 

13091 ggaaaaagcc acgaacatgg tattaaaact 

13161 gtcgaaaaag tggaggcaat ataatgatac 

13231 tggagctgaa aatgttgact ctatcactga 

13301 gtttttaaag acgaacgtga tgagtacaag 

13371 gaaaacgtaa cgaagagctg gagaacatgt 

13441 ttactgtttt aaaattagag aactacaccc 

13511 ggtaaaagca ctgcagatat tatactgtcg

13581 ttttagggca aatggaggca gacacaaaeg 

13651 gtgcgtataa cggtaatgac acagaggggt 

13721 gtttgatgaa atacttgagg gaatgacaaa 

13791 gaagcagtag gggttatggc aggtcaagtt 

13861 gtaggagata aagtatacaa ccatgaaaca 

13931 gagatacaca ttataaactg tctgatgatt

14001 tctaactaag ggggacgagt gagtggaatg 

14071 aataaaaatc taaagtcggt atacgtaaca 

14141 ttgaaatttt taatgaagaa gtgttattaa 

14211 ttggattaat cctaaatctc ataagacgcc 

14281 tttgaatttt tggaggacga gtaaatgctt

14351 atttaaacga cgactggtgg tacgagctag 

14421 tgatattgat agagtgctta aatttattga 

14491 taacagtaga tcaattaaaa gaactceeac 

14561 tttaaatgac acagtagcta gtatgattat 

14631 aattggaaga aacaaccagg taagccatta 

14701 gtttgcaatt aactctgact actgttgatg

14771 gattgaaaat gaagttactt tacctaaact

14841 gaacaatttg taaaaggtat tgataatagt 

14911 cttactatac aatcgaceaa ctcattgacg 

14981 tggaacagca gacgcaggaa aaggatacgt

15051 gtgacgcaat acttagtcac aacattcaaa 

15121 gagataatca gacgtttaca gttgttgagg 

15191 agttaagata aggagagatg gagatgccaa 

15261 agtagctagt aatgaaatat cagaactatt 

15331 ggcgataata gagatatcga aaaaaaaaga 
tgttgaatgt aacatagcca aaaacaaaga cggcgaaacc
acccagaggt tcttcacatg aatataatgc aattcaaaag 
gcaaagcgac ccgattgtag caaatgtata tatcgagact 
aacgagtcat cgcctttcga tgattacgac agagttgaaa 
aaacacacat taaggagcgt taaaaaatgc cgaaagaaaa 
agacattaag gtcatcaagt ataaagacaa cgtaaatgaa 
gacgaaaaga aaattatgac tgatagtgac ctaaaacgat 
aagagctagg attgcaagca acgatatttg atatctagag 
aagtcgagta caaaggaatc gtatttgata gcaaagtaga 
tatgaatggc actaactatg atcgtaccga aatacaaccg 
caaagaccga ctacgcacac agccgaettc tctttgcgga 
ctaaaggtaa ggogaccgaa gttgccaaca ccaaagcgaa 
ctcaacgtgg atatgtaaag cgcctaaata cacaggtcaa 
gtcagacgta aaagaaaaag agaaatgaag cgatctaatg 
gatataagaa tacctacaga agctgaatat cagcattacg 
caaagcgctt agatgacaat ccggacgaat tactaaagta 
agaggtggaa taaatgaagt tgaacgaagt attcgcaact 
gtaagtgtcc aagatttgca caatgaaact ggcgtatcaa 
aagctgagat ggtcaactta aatgtattag ataaattggc 
atttactaga aatcacaaca cgcacaaatt agaggattgg 
aaacgagtat cgtaaagatt aacggtaaac catataaatt 
gaacggttta actccaggaa tggttgcaaa aagagtacga 
gcaccttatg gtatgcgctt agctgagtat aaagaaattg 
aagagcgtga aatggttagg caacgacgta aagaggctga 
tgtgcctcaa aaacattccc gtgatccgca ctggttcgat 
agtgaagcat aatgagcata atcageaaca gaaaagtaga 
accggcgcat tacacacacg gcaacattga aattatagac 
cctcaaccag cattcgcaat aggtaatgca atcaaatact 
aggatttagc aaaggcgaag ttttacgtcc aaagagcttt 
caaaaacaag ttgattacgt aatgtcatta caggaacaat 
acgaacaagt taaagctatg agtcataaag aagttagcaa 
ggatgaagag ctatataacg aatgcatgtc gtttggtctg 
acgatagcgc acgcaaagaa tacttaaacc aatttttcag 
gcgagtggca catatccatg tagtaaatgg cacttattac 
ggcgtgaaaa agacatttga tactgctgaa gagctcgaaa 
aggaacagaa gcaaccaact ttattttaga ggagatggaa 
atgacgctac tcgaaccggt ggaatgggca tggaacaacc 
cagatagaac gggcacgctt ggagaatgta gcgaagtaca 
aaaagcagta acagataaag atatttttac tgtagaaatc 
gattgtctag cagaactaaa cgatattgaa ggttttgaaa 
tagacggtac ttccagagcg ctttatatac taaacgaaga 
ggagttggta gtatgatgca aacctataaa gtatgtcttt 
ataaattaaa gaaacattat ttcgtgaaaa gtacgaatga 
gattcgtaaa aagctcccgt tcgaaactgc aagcatagaa 
aaccaacaag agaagaatta attaatttca tgaaaaaaca 
tgagcaaagt gcaataagac actttagagc tcaatcaaaa 
aagcaacgag atgagcttat cgaggatata gctaagttaa 
ggcgcacagt caaaaatgaa ttgcttggaa gatacgaaca 
tgagagcaaa gcgaacagga taggagctct ctatatagga 
cgaatggaag aactagacgg aacaaatgag ttetacgaat 
aataaccgtg aacaaataga acaatcagtg atcagtacta 
cactaaaaga gactgaggac gtgtataaga aagcgcaagc 
tgctattcaa caetcageta aagaaggtat tgaacttgat 
gtctataaat atgaggagga geaggaaaat gagtattagt 
aacgaaagtc tagagattgt gcaattggtc ggagatatta 
cagttattag cactatagat tttattacta aaccaattta 
gaaacgatta aaaaatgtgg tgccgcaccc agttatcaaa 
aaagataatg tgaaagaggt tcaaaaagaa ttaggtttct 
ctggattctt atcatttcaa aggataccta tttacattat 
tagatattac tttgetaacg agcatgagat tgaaagatat 
gaaatcatcg accaacgtga tgcattgcta gaagaaaagt 
attattggct gaataaacgc aagtcagaaa atgaacagat 
ggaattaaaa cgataggaga taacgaacaa atgaataatt 
aaacacaaaa ggagcccgac gatagaacac cgactagaaa 
tgaatctgcg gagcgggtta acacacttga gttttttaaa 
gatacacaat cagatgagat tgctgattac ttagctttca 
aagaagattt ggaagagact actgaggtta tggttgattt 
acattcagtc tattttgttc atgtaatgca tacactaaea 
attgcacaag tcttaataat gccttttttg tacgcca&ta 
catacaaaaa gaaaatgaaa aggaaccaeg aaagacaaga 
gtaaagacac cttagatcga gtcaaggagg ttttggggaa 
gattcaacag gacaaccaca tgaacatttt actgctgcta 
cggagagtaa agaaggagcg aaagagaagt acgagaaaca 
agaaaacggt aacgattgat gcagacgaaa acttattagt 
atatgaatat gacagtgagt taatgtcagc tgatgaagat 
gacgcattaa aacaagctac acaaattacc gataaaccaa 
15401 catgtcgagg aggcagacga tgattaacat acccaaaatg aaattcccga aaaagcacac tgaaacaatc

15471 aagaaatata aaaacaaaac acctgaagaa aaagctaaga ctgaagacga tttcattaaa gaaattaatg

15541 ataaagacag tgaattttac agtcctatga cggctaatac gaacgaacat gaattaaggg ctatgttaag 

15611 aatgatgcct agtttaattg atactggaga tggcaatgac gateaaaaaa ctnaaaaata cggaccggtt

15681 cgatatcttt attgctggaa tactgcgatt actcggcgta accgcactga cgcttgttgt cataccgcct 

15751 atctatacag tggctagtta ccaaaacaaa gaagcacacc aagggacaat tacagataaa taeaacaaga

15821 gacaagataa agaagacaag ttctatattg tgctagacaa caagcaagtc accgaaaact ctgacttact 

15891 actcaaaaag aaatttgata gcgcagacat acaagccagg ttaaaagtag gcgacaaage agaagttaaa 

15961 acgattggtc acagaataca ctttttaaat tcatatccgg ecccatacga agtaaagaag gcagataaat 

16031 aatgattaaa caaatattaa gactattact cttactagcg aegtatgagc taggtaagta cgcaaccgag 

16101 aaagtatata ttatgacgac ggctaatgat gatgtagagg cgccgagtga cttcgcaaag tcgagcgatc 

16171 agtctgattt gatgagggcg gaggtgtcag agcagacgta tagcaaagag tcaategcta atacgacagg 

16241 cacacataaa atgaagtgta atgtattagc tgatgtaata ccggaatatg atagcaactc aattgcacag 

16311 tatggcatac aagcaacgtt gccgaaacca caaggggaaa actcaagtaa agctgaagac gttgttgtga 

16381 ggcttgagag agcaaataaa aggtatgctc agacgttaaa agaggttgag tttataaacc aatcgcaaca 

16451 gagactggga cacgttgact tttgcttctt agagttattg aagaaaggtt aeaacaggga tgcgattatc 

16521 aagaagacgc ctaactctaa attaaataga aacaacttct tagcgcgccg tgaegagtta gcagaaaaga 

16591 tttatctact acagtgacga aaatgacaaa aacgacagaa atgacgaaaa cgacactatt tctaaactgt 

16661 gaattaattt tatacaattg aettgtaaga attatcttaa gacgcggggt aatagccaca tcagatgtcc 

16731 tcatcgatgt gattgagaag tgacaaacac acaaaagacg atatgttacg ceattaatca cctactacct 

16801 gcctatatgg tgggtagttt aattcctgca ttttgagtca taactaectc cctcctttca cattcactga 

16871 acgtagctcc tgcacaagat gtaggggcac tctttatatt taaacaacta gagtaatcaa cgtaaaggcg 

16941 tgcgatacag egaaaacaat tgaCtaaatc aacaccgaag caagaaaagt ttgtgccagg actcatagag 

17011 ggcaagagcc aacggaaagc atatattgac gcagggtate cgactaaagg eaagagtggg gaatacccag 

17081 ataaagaagc gagtacactt tttaaaaacc ggaaggttee cggaaggtac gaaaaaetgc gtcaagaagt 

17151 agctgaacaa tcaaaatgga cacgccaaaa ggcctttgaa gaatatgagt ggctaaagaa cgtagccaag 

17221 aatgacattg aaatagaggg agtgaagaaa gcgacagccg atgcattcct cgccagttta gatggtacga 

17291 atagaatgac gttaggtaac gaagttttag ctaaaaagaa aacagaaacc gaaattaaga cgcccgagaa 

17361 gaagattgaa caaatagata aaggtgacag cggaacagaa gataaaatca aacaacttca cgacgcaata 

17431 acggaagtga tcgtcaatga ataaacttaa atctctatac acggacaaac aaattgaaat attgaagcaa 

17501 acgcaaaaac aagattggct cacgttaact aaccacggag caaagcgtac aggtaaaaca ataccaaaca 

17571 atgacttatt cttacgtgag ttaatgcgtg cgcgaaagac agcagacgaa gaaggaattg agacacctca 

17641 atacatactt gctggtgcaa caetaggtac gactcaaaaa aacgtactaa eagagttaac eaacaaatat 

17711 ggcattgagt ctaattctga taaacacaac ecatecatgt tatttggcgt eeaagtggtt cagacaggtc 

17781 acagtaaagt aagtggtata ggagctacac gcggtatgac aCcgtttggt gcatacacea atgaagcgtc 

17851 gttagcgcat gaagaggtgt ccgacgagac taagccacgt tgtagtggaa ctggtgcaag aatattggta 

17921 gataccaacc ccgaccatcc cgagcaetgg etgttgaaag aetatatfcga aaatacagac cctaaagcag 

17991 gtatactgag tcaccaattt aagctcgacg acaacaacce tcttaatgat agatataaag agtctattaa 

18061 ggcttcaaca ccaecaggta tgttctaega acgtaatatc aacggtatgt gggtgtetgg tgacggtgta 

18131 gcacatgccg actctgatct gaacgagaac acgattaaag cagatgaaet ggacgacata cctatcaaag 

18201 aacacttcgc tggtgtcgac tggggttacg agcactatgg atctattgtg eeaataggac gaggtataga 

18271 cggtaacttt tattttattg aggagcacgc acaccaattt aagtttattg atgattgggt ggtcattgca 

18341 aaagatattg taagtagata tggcaatate aaettttact gegatactgc acgacctgaa tacatcactg 

18411 aattcagaag acatagatca cgtgcaatta acgctgaeaa aagtaaacta ecgggtgtgg aggaagctgc 

18481 caagtcgttc aaacaaaaca agteactcgt tctttatgat aatatggaea ggtttaagca agaggtattt 

18551 aaacatgtct ggcaccctac aaacggagag cctataaaag aatttgaega cgcgttggac tcgttaagat 

18621 acgccacaca cacacacact aaacctgaac gattaaggag ggggaaacga cattgtataa gttaatagat 

18691 gacactgaag cacaaggaat aetgcctaag catatcgagg ctctaataga gtcacataaa gacgatagag 

18761 agagaatggt taacctctat aatagataca agacacatat cgactatgta ccaatattca aacgtcgacc 

18831 aactgaagaa aaagaagatt ttgaaactgg tggaaatgta aggcgattag acgtgtctgt taataacaaa 

18901 cttaacaact ettttgacag cgaaattgtt gatacacgtg ttggttattt acatggtgtt cctgttactt 

18971 atgaettaga tgaaaacgca gaaaaaaacg aaaagttgaa aaagtttata accaactttg ccattagaaa 

19041 tagtgtcgat gatgaggatt ctgaaatagg taaaatggca gcaatttgcg gatatggtgc taggttagca 

19111 cacattgata cgaatggcga tattaggatt aagaatatag atccctataa tgttattttt gttggcgaca 

19181 atactccaga acctacatac tcattgcgct acttttatga aaaagatgat gataatggca ctgattacgt 

19251 gtacgcagag ttttacgata atgcttatta ttatgtattt cgaggagaag gtattgacgc tttgcaagaa 

19321 gctggacgat atgaacattt atttgattac aatccattgt tcggtgtacc taacaacaaa gagatgatag 

19391 gagatgctga aaaggttatt cacttaattg acgcatatga tttaacaatg agcgatgcat caagtgagat 

19461 tagtcagaca cgtttagcat accttgtgtt acgcggtatg ggtatgagtg aagaaatgat tcaagaaaca 

19531 caaaagagtg gcgcatctga gttgttcgac aaagatacgg acgttaaata cttaacaaaa gatgtaaatg 

19601 acacaatgat tgagaaccac ttagatcgaa ccgaaaagaa tatcatgcgt tttgcaaagt eagtaaactt 

19671 taattctgac gagtttaacg gaaatgtacc tatcattgga atgaaactta aacttatggc tttagagaac 

19741 aagtgtatga cgtttgagcg taagatgaca gctatgttga ggtatcaatt caaagttatt ttatctgcat 

19811 taaagcgtaa agggtacaac ttggatgatg atagttattt aaacctgata tttaagttca ctcgtaacat 

19881 tccagttaat aagttagaag aatcacaagt gccaattaac ctgaagggac aagtttcaga acgaacaagg 

19951 ttaggacaat cacaactagt tgacgatgtt gattacgaac tagacgaaat ggaaaaagaa agtcttgaat

20021 ttaatgacaa attaectgac atagatgaag gtgacgcaaa cgacaaatcc caaaataacc aatcagaatg 

20091 acattgatga gtatatcgag ggtttaatct ctaaagcaga aaaaccaata gaacaactat ttgctaatcg 

20161 acttaaagag ataaaacaaa tcatcgcaga tatgtttgag aaatatcaaa atgatgatgt gtatgttaea

20231 tggactgaat tcaataaata caacaggctc aataaggagt taactcgtat aggtacaatg ctgacttatg

20301 actataggca agtagctaag atgattcaga agtcacaaga agatgcctat atagaaaaat tccttatgag 

20371 cctttattta tatgaaatgg cgagtcaaac atctatgcag tttgatgttc cgagtaaaga ggtaatcaaa 

20441 tcagctattg aacaacctat tgagttcatt cgtttaatgc caacaceaca aaaacaccgc gacgaagtat 

20511 tgaaaaagat acgtatgcac attacacaag gtattatgag tggagagggt tactctaaga tagctaaagc 

20581 aatacgtgat gatgtcggca tgtctaaagc tcaatcattg cgtgtggctc gtacagaagc aggcagagca 

20651 atgtcacaag ctggacttga tagcgcaatg gttgctaaag ataacggttc gaatacgaag aaacgttggc 
20721 atgccactaa agatacacga acacgtgata

20791 gaactttaaa tcaagtgggt gtgttgggca 

20861 aatattaatt gtcgttgcaa attactttat

20931 gtaaagacga cggtaaaaat gaagttatcc 

21001 aggtggtaat tgatatggat tttaaaataa 

21071 acgcaccaaa cccctgeacg aagagataat 

21141 aaaaatgaag ctgatttaga tatggttaaa 

21211 cactttttta gttgtctctt tgctactcga 

21281 cggaeegtta gggtacgcga agggcaaaaa

21351 tttgaagaac acaaagacga taaagaagta

21421 acgttaaagg ctttttagat acagaagaag 

21491 gaaaggacta gaatcatgga aagagaaaaa 

21561 cccgagcaat cagaagaaca aaaacgtatt 

21631 caaaacgtga gaagttaaga agtaacgcgc 

21701 tgacagattt ttaggcgatt ctgatgaaga 

21771 aagcatgttc aaaaaggcgt tgagtccaaa 

21841 aagatttaga cccttcaaat gtaaagtcca 

21911 tgaggtaata aaacatggca actccaacac 

21981 cgttattcca gcagaacaag gcactttaac 

22051 gctaaaaacg agccaacgac agcacaaaag 

22121 gggtatcaga aacggaacgt attcaaactt 

22191 aattggtgta attattccgt tatcaaaaga 

22261 aaacctccaa ttgcagaggc attttacaaa 

22331 acaacacttc aactagtggt aaaccgcctg 

22401 taataaetta cacgcagacc tttcggcact 

22471 gtattaacca cacgttcatt cagaagtaaa 

22541 atgctaacgg gaacgagatt acgggattac 

22611 accgttagca ctaatgggtg actgggacta 

22681 tctgaagatg ccacgttaac gacgttacaa 

22751 gtgatatgtt cgctttacgt gcgacgatgc 

22821 gcttaaacca actgaatagg aggagatatg 

22891 atatgactat taccgttaca aagaaggcat 

22961 atcacgtegt actacgtccg ataagagcga 

23031 ccaagaaaaa cggagcggaa gttaaaagtg 

23101 agaagatgtt agggatataa caaacaatga 

23171 aaaaagtatg tcgcagatgt cctagagcat 

23241 gtacggggac agtgtcgtac acttataacg 

23311 taaacgagca aagtttcatc cgtttaaacc 

23381 aattccetca cactatttct attggaagta 

23451 tgtaagcgat aaaacaatea aaggatttat

23521 atgtcacaag aatatgacag aaacctatat 

23591 agtatgaggg tagaatcttt agtattgaag 

23661 actacgactt aagcaggtgc catatggcaa 

23731 taagttcgat aagaaaatag aagagtgggt 

23801 actgctgtag cattagctce tgtcgaccta 

23871 gtgggttatc cagtgttata agtgtcggcg 

23941 tgctactggt cctggtggta gtcgtgctac 

24011 tacaccacac acggtcaagc gccacagcca 

24081 agcagtaett ttcatagagg cggttaaaca 

24151 taaaagatta atctcagacc ctaacattaa 

24221 gacgctgttt acccatatat tgttgtgggc 

24291 gagaaacagt cggtattgtc atacatgtgt

24361 aagcgcgata ggttatgtgc ttaacagacc 

24431 gatagtcaag cagtattccc tgatatagac 

24501 acagacataa aaagaaaaac gaaggagtgt 

24571 agctgaaact gactcagaec cagtagaatc 

24641 gaaaatgatt tagctgaaat agtacgaggc 

24711 ttaaattaac aattggtaat gtgcctggag 

24781 tggacagttg cgtatatggc tttacgagcg 

24851 tatgttgttc cagaatcatt tgaaatgtca 

24921 ttaaatggaa tacagcagaa ggtgccgaag 

24991 tacagttgaa cacgaaaaat tcggcgaaaa 

25061 tctgactcac acacggaaga ccatcctacg 

25131 cacaaaaaaa ttgaaaagag gtatatactt 

25201 cggagaaaaa gaccacgaag tagaagcaaa 

25271 gaagatagcg aagatgggag aaaaggagca 

25341 ctagaaacaa agcgatctta caattttggg 

25411 acaattagaa aaagcaattg atgatttcat 

25481 ttggacaaac tcaacaatag tggttttttc 

25551 caccgaatat ggceaaaagc gaggacaaag 

25621 caaggaaacc atgggcgcag aaccctacac 

25691 gatatatccc cgaacacgaa ctgttagcac 

25761 ggataggtac ctagatcaaa gacaattatt 

25831 aagaggctaa ctagtatgac tcgtgacatt 

25901 ctcgtgtaca aaaagctaga etagaagaag 

25971 accccttgaa ccgaaaggag gttagccttt
cccaccgcca cttagatggg gaatcagtgg aaatagatca
ggcgcccaag ccacccaccg gCgcaaacag cgcgaaagag 
tatactgatg aaaatgaatt gccaactgta atgagagcac 
cactcatgac ctatcgtgag tgggagaaat ataagcgaaa 
aagtaaatgc tgatactggc gaagctatag aaaagttaga 
agagctacaa aacgaaaaag ttgttgtaaa cgtaacagtt 
acatctatta gcgaagaaaa tgccaaaaac aatgatttca 
ccctagcatg tcgttaaact gctttttatt atgcactttt 
ggagttttga tatatgaata tcgaagaagt taagtctttt 
aaagattatc taaagggact taagacggtg tctgttgatg 
gtaaacgact catccaacct gaatCagacc gttatcattc 
ccttgaggac ccaaccgaac aagaagtacg gaagcgtaat 
agtgcccrtg aacaagagtt agaaaaacgc gacgcagagg 
taggcaaagc gcaggaacca aatctaccaa catccttagt 
tactgagcaa aacctaaaag ctttaaaaga aacctttgac 
tttaaaccga gtggaagaga tgctaaagaa tcacgaaatc 
ctgaagaaat ggcgaaagaa atcaatatta gaaaataaag 
acacgccagg caatgttatt ctatcggatt ttaaaaacgg 
cacgaaagac actatggcca attcagcaat tatgaaatta 
aaaaaatcta ettacctagc aaaaggtgca ggcgcccact 
ctaagcctga atatgcgcaa gcagaaacgg aagctaagaa 
gtttcttaaa tggactgcaa aagatttctt taatgaggtt 
gcgttttgacc aagctgttat ctttggtact aaatcacett 
tcgaaggcge agaagagaaa ggtaacgttg ttacagatac 
aacggctact accgaagatg aagagttaga tccaaacgga 
atgcgtaatg ctttagatgc taatgacaga ccattatttg 
caecateeea tactggagcg gatgtatacg acaaaaagaa 
cgcacgtcac ggtatcttac aaggtattga gtatgcaatt 
gcatcagatg cttctggcca accagtatca ttatttgaac 
atatcgcata catgaacgtt aaaecagaag cgttcgcaac 
atggctaatc ctgcagaaga gactaaggta aaaaaagaca 
ttgactctta tcacagtctt gtcggttaca aagaggctaa 
gtgataaaaa tgactcttta tgaagatgtt aaacttttac 
acgaagaaga aatatttaag atggaagttg acggaatact 
ttttatgaaa gatggtcaag tcatttatcc ttactcaatc 
tatcaacgac ctgaagttaa aaagaattta aagtcaagaa 
atggrgtccc tgactacatt agtggagtat taaacaggta 
aataaggcag aggtgtcgtt tgtgtttaac ccatacgacg 
tcaaaaaagt aggagagtat ccaatcatac aagagcgctt 
ggatacgcct actacatctg aacaactaaa atttcatcaa 
gtaccttatg acttgccaat atctaaaaac aatttacttg 
gtgattctgt agatcagggc ggacaacacg aaattaagtc 
aagttaagta cggtgctgat agcatggttg ttgaattgga 
taaaaaaggt atcgctaaaa caacgacgaa gatttacaac 
ggttttttag aagaaagtac tgactttaaa tatetcgatg 
cagattatgc aatatacgtt gaatacggta ctggtataca 
aaagattccg cggagcttta aaggtgatga cggcgaatgg 
tettggaacc ctgcaattga cgcaggacgc aagaeattcg 
tgtgggtatc agttgagccr gaacttacaa atcaaatata 
caaactagrc gatgataggg tttttgacgt tgrccaagat 
gaatcaaacg tcactaacaa cgaatctagc gcaacaatga 
attcacagtt cgctacacaa tacgaggcta agctcatttt 
catagaaaca gataattacg agtttcaatt tagccgtatc 
aggcttacta agcatggcac gatacggctt ttattcaagt 
attaaaeggc gcaaaaaaac tatttagcag ttgtacgtcc 
cttattatta gccgacttac aagaaggtgg acatacgatt 
ggtaaaacgg actattctcc caatgcaatg tcagaatcat 
ataaaggaae cgaagcagtg aaacacgctg tacaaacagg 
taataaacgt gcagacggCa aacatcacgg aatgtttggt 
tttgatgatg aaagtgacaa aatcgaacta tcattaaaag 
ataacttgcc gaaagagtgg tttgaagctg caggtgcgcc 
agccggaaca ttcgagaatc aaaagaaagc tagtgttgta 
eaaaccaata gatcaagggg gcgeaagcec cetaettttt 
tgactgaatt taatccaatt acaacattaa aaattaatga 
agcaacactt gcacctgacc gaaaagctga aaaattctca 
acgccaggat tcaatgttac ccttaacggc ttgctagaat 
aatgtgctac tgcttattta aaaaacccac caactcgaga 
cactgaaaac gaggatactt tgccgttatt acaaggggct 
aagagggaga gtcgctcgta ccggatgaca ctgaacaaag 
aaacgacgaa agcaggcata gaaacgatga aagagaatta
gattactcaa aaataaggca actgacagce agatatttag 
caacacctgc cgaatggcgt gattggctta ttggtggtca 
aattgaacaa gcgcaagcta acggcttagt acaagcctct 
gagaaacaac gttacgaaac aagagaacct ggtagctatg 
aaaaaagaag acgcgaactc ctcaaagaag gtacaagaaa 
ggatacccac tttatggcaa agattatggc caatattaga 
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26041 gatetccaaa gcaacgtaag gaaagctcaa 

26111 atgtaaaagc agaeaetcca agattccaaa 

26181 agagcacccc gctaaattat ccatgaaaac 

26251 gtagagcgat tcaaacaaca taaagtagat 

26321 caactaaagc cactgtcgaa gcttggagaa 

26391 aatggcggct aaagggccta aagaagatct 

26461 agatggaaac raggaaataa attcacaaaa 

26531 gaagaattgg tcagattatg agaaaagaag 

26601 attgaaagat tacggcgaga aaatggacgc 

26671 caacaggtca aaggcttaaC gattgctagt 

26741 caataatggc agtacttaat gcggttggtg 

26811 tgtcgcaggt cccggagttg ttggctttgg 

26881 acattggcag taacaaaaga agctcaaaac 

26951 atattgttaa agagaatcaa gcaagtatct 

27021 gatgtctcaa ctaaaaecat tcttatccga 

27091 aattgggtta aacattccga aacagctaag 

27161 tcggagattt actgaacgct gcaggacgat 

27231 gttgttcaaa tttgtgtctc aaggaccaca 

27301 gctggtcaga aegctattaa agcgtttatt 

27371 ttggtaatgt gttcgctggt atcggeaatt 

27441 ttggttggtt aaattaacct cccaatttag 

27511 gactttatea gttatgttca agagaatggt 

27581 tagttgcttt tggtactgca atggctccta 

27651 atttaecgct aaactattcg aaacacaccc 

27721 ggtgtatttt gggctttaat ggctccgatt 

27791 gcetattcag cgtcactgaa aagattttag 

27861 agcattaata ggtgcattcg gttcgatttc 

27931 attggtgtcc tcgtttattc atggaaaaca 

28001 gtgttaaaac ggcagtttct ggtgcgattc 

28071 ccaatctacc ttacaaccga caatgcctat 

28141 gttttggtaa taggcatcat cacaaacgct 

28211 cgttccaagc cacaggaaca gtgatatccg 

28281 tcagttgctt actggcgact tctcaggtgc 

28351 acgatttggc aatacatgca atcagtttgg 

28421 cactttctat gtttggtaca agttggtcac 

28491 gaacactgtt acaagttggt tcagccgagt 

28561 tttattatca caaaaggttc tgaatgggtt 

28631 tagctgatgg gtttaaaaga gttgtctcaa 

28701 aagtttcttc agtgatttct taaatgccgg 

28771 tctgcgcaca aagtagtcag cgcggtaggc 

28841 taagtggaca cggtggaggt agtagcctag 

28911 agactttggc agcgccttta ataaagagct 

28981 aettctatag acagacaeat gactagcgac 

29051 atgtaacgat tagaaatgag ggcgaccttg 

29121 cggaagttcc aacttattat aagggaggct

29191 gttcacagta tcgcgtcagt gacaatcctt 

29261 aggcgcagga tatcatcgta actattctga 

29331 gaagaactta aaaaagtaga gcetaagata 

29401 agtcagacgc ccaagcacca tttgctggac 

29471 taagtatgag catatattag atataecaaa 

29541 caactttttg taggactagt aagtgaagtt 

29611 cgtttgaaac aaccgaacta ccatactttg 

29681 ccctgaaaaa tggccggtac ctgatagart

29751 tacaacacta acecaggaga agtttattat 

29821 ctgttgaaat agagctagct gaagatgtca 

29891 aggaaatatc tcagttatta aggaagttga 

29961 acctaeagag gecatttaaa tatagattce 

30031 ggaatcgact caagtctaat aaagtaatga 

30101 agtagcctac gccaatttta ttaaaaagcc 

30171 aagtaaaaag ctaaatgaag atagttcttt 

30241 ataggtgcca taactaaaat gtggacgatc 

30311 ttgtcatact tgataagtct actactggcg 

30381 tgatgacctt aacaattcta ggatttacca 

30451 actgtctcta aaggaacggg ttataagtat 

30521 taggcaaagg agatacacga ttagaaatct 

30591 cgacgcaaag actaaaacgt ttcacttgca 

30661 ggtgtgaatg ctgataacgt caaaatacaa 

30731 gtgattttga tggacaacag acttttgcag 

30801 attgataggt aaaagagaag cgccaccgcc 

30871 gcaatggagt tattgataaa gaaaagtgtc 

30941 atttcccaga agctaacccc aaaataggtg 

31011 ctcagcgaga atagtcgaaa tcactacaca 

31081 ttaggagact ttacaaggcg taatcgttat 

31151 caaaatctac aaaatccgac ccatctaaag 

31221 cataaataac gaattggtta agcagaatga 

31291 gttacaactg ccaatggtac gatcatgcac 




cgattagcaa agacgtccgt accaaacgaa aetgaaacag 
gagctttaca acgcgctaaa tcaatggctc aacgacggcg 
agatgagcac aaagcgaatt tagaacgcgc taaagcccaa 
ttgaaactaa gtaacactga actaatggcc aaatacaatg 
aacatgttgt taagttggat ttagatgcaa accccgccaa 
aatagatctt agcaggcata gttccgacat tgattccagc 
gaattcaatg aagtcgaagg agcagttaaa cgttctttcg 
taaatggaac aagtgatatt tggggtaaac ccaacaactc 
cttagctact aaaatccgaa ctcccggcac tatcttcgcg 
atacaagcat tgataccagc gatcgccgga ttagtacctg 
tattaggtgg cggcgtctta ggtttagttg gcgcattctc 
tgcaatggct actagcgccc ttaaaatggt tgaagatgga 
tttagagatg cgagcgatca gtcaaaaact acatggcgtg 
ttaatgcgat gtcagcaggt atcagaggcg ttacaagtgc 
agtatctatg ctagttgaag caaacgcacg cgagtttgag 
aaagcgtttg aagcactgaa tagcataggt ggcgcaatct 
ttggcgacgg atcagttaac attttcactc aattaatgcc 
gaacatgtct atagccttcc aaaactgggc taatagtgta 
gactacacta ccactaactt acctaagatt ggtcagatat 
caatgattgc ttttgcacaa aacagttcca acatttttga 
agcatggtca gaacaagtag gacaatcaca agggtctaaa 
cctactatta tgcagttaat cggcaatatc gtaaaagcat 
tagctagtaa attgttagac tttatcacta acctagctgg 
agctatagca caagttgctg gcgtcatggg tattttaggc 
gttgctataa gtagtgtact tacaaatgtg tccggtttga 
acttcgttag aacatcaagt tcagttactg gagctacgga 
agcacctatt ttagcagttg ttgcagtaat tggtgcattc 
aacgagaact ttagaaatac tattactgaa gcgtggaacg 
aaggtgtagt cggctggtta actgaattgt ggggcaaaaC 
attgcaagta ttaggacaaa tattcatgca agttttaggt 
atgaatatca tacaaggtct gtggacttta attacaattg 
tagcagtcca aatcatagta ggtttgttca ctgctttaat 
ttgggagact attaaaacta cggttaccaa tgtgcctgat 
gagtcaatta ccggcttttt aactggcgta acgaatcgaa 
agatatggag tacaatcact aattttgtta gcagtatttg 
ggcttcgagt gtagctgaaa aaatggggca agcactaaac 
tctaacattt ggaatacagt Cacaagtttc gcgagtaaag 
atgtaggtga cggtatgagt gatgcacttg gtaagattaa 
agcggaatta atcggcaaag tagccgaggg tgtagccaaa 
gacgcgattt catcagcttg ggactctgta acttcatccg 
gtaaaggttt agcggtacca caagcaaaag taattgctac 
atcctctact ttgacagata gtatagtaaa tcctgtaagt 
gttcaacata gcttaaaaga aaataataga cctactgtga 
atttaattaa atcacgcatt gatgacatga acgccataga 
tgttagttga tagcgcacga catagaagta acaaggaatg 
tcacttataa tcacttggaa gtagttgaat ataacgttac 
catagagggt attgatggta gattccataa ttacgctaaa 
aggtataaag cacccaaaat tgcttatgct tcacatttaa 
gtttttattt aagggaatta gctacaccag acaattcaat 
agacaaacaa gcatttgagc ttgaetatgr cgatggacga 
tcttttgaca caacacaaac atcaggggaa ttttctttgt 
aaagtgtcgg ttatagtact gatcttgaaa gcaataacga 
gcctacaaac gaaggtgata agaggcgtca aatgacattt 
aacggtgatg ctcctttaac acagtttaat cagtttaatg 
aagctaatga taaggatgga ttcactttct atacagataa 
tttaaaagcc ggagacaaaa taatcttcga cggtaaacat 
tttaataaaa ctttagaaca accggtttta tatccaggct 
aacaaactac atttagacac aaactacatt ttagataagg 
tacagggtgt agggcacgct attaatgtta gtacaaaggt 
ggatctaacc attatcgaga acgcgagtac gtttgacgca 
actcatgttg aaggtgaaga tgatttcaac gaatatgtaa 
aaaaaataag gcttgatatc aaagccaggc aaaaagaact 
agagtataac gaaagtttca caggcgttga gttcttcaat 
gtateacatc caaaagtaga cgcacctaaa ctcgagggat 
ttaaaaaagg acttgagcgt tatcatctcg aatatgaata 
tgatgaatta tctaagtttg ccaattatta cattaaagct 
gaagatgcat ctaaatgtca tacctttatt aaaggttatg 
aagcgggacc acaaattgaa ttcactcatc cattagcaca 
tgttgatgga cgtattaaaa aagaagatag tttaaaaaaa 
actgcttcta tttccccaga ctttgtagcg ttacgtgaac 
atgttgttag agcggtggat tccgeeatag gatataacga 
tagagatgcg tacaacaata tcactaagca agatgtagta 
aacaaagcag ttcatgatgc tgcaaattat gttaaaagcg 
aactaaaagc actaaacgca aaagctaacg caagtttatc 
aaaaacaaac gctaaagtcg ataagatgaa tactaaaaca 
gaccctacca gtcaatcaag cataagaaac accaaaccaa 
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31361 ttggaacgat eggcgactct gtagccagag 

31431 gaaaetgaaa gctaaaacga ctaatcttgc 

31501 gaagcggtag aaaacagcat ttatagacaa

31571 ctgatgatga ctggctacac ggttattggg 

31641 gtettacggt gccttttgtt ctgcaattga 

31711 atgacagcta caagacaatg ccccatgagt 

31781 tagggttaac acttgaggac tatgtaaacg 

31851 tgacgeatat cacacagatc accccaagcc 

31921 ttacacccta acgaaaaagg tcaegaggtt 

31991 actaaaggag gcaaccaacg gceeacggat 

32061 cgctcaacac gagcacaact accgctcgtt 

32131 catcaaaaag aagaaacata cgcacactca 

32201 atetaacgta tttaaatagc cgttttagca 

32271 aaaagacgcg cgtattgata atacaggtta 

32341 tcaacactag aegctetcac caaaaaggte 

32411 cagaacaccg attcgaacca aaagagcaag 

32481 agcaacgcaa ecattccggg tagaccctag 

32551 cactacatgt tacceagatt gaagcccaac 

32621 acggtacaca caatgcgtac agatacattg 

32691 caaaaacaac aagtttgtac gcttccaata 

32761 gtcatgccga atatatttaa cgacagatat 

32831 tcagacgtga atataaagce tccgaaagac 

32901 tgacgatatt gataaaggta tagacaaagt 

32971 acacaaccta tgcaaggtat cacttatgat 

33041 ccaaccctaa ccacctacaa ggtttcgata 

33111 cggcggtgtg aataataact ttaaaggaga 

33181 gaaacaggac gtaaagcact tttaataggg 

33251 attctatcgg ccaaagaggt gttaaccaat 

33321 aggtggacgt gttaaaccgt taccaaeaca 

33391 tactataeee aeaegcaaga cacacaaaat

33461 ggtggtccte ggatgtactg cctggacact 

33531 aggtagaaae atgcetaaat tcgaacgtgt 

33601 ttccgtccgc aaaaegcegg ttatcgggaa 

33671 tcgttggtte agatttctat atcaetactg 

33741 aggtategca ggttggatat tagaagtaaa 

33811 aataacctcc cgtcegcaca ecaaetttta 

33881 tattcgaagg aaaggtggtt gaataatgat 

33951 caaacaacat cacaaeataa tccaattatt 

34021 gtgttttaaa ttttgcagta actaagaata 

34091 eaecgtgcta aaaaccgaeg attataacgt 

34161 gaegcaaeca acgggcgttt gcagtaegtg 

34231 ctcaggcatt ctetacacaa aacgggagta 

34301 aaatgattta gtcagtgggt tcgatggtat 

34371 gaagcagtcg gcaaagacec eaaccaaeta 

34441 tgaatgatag tgcgacaaaa ggcattcaac 

34511 tgcgacgcaa actagtgcaa cacaagctgt 

34581 atttttgaac gtgttaacga agttgaacaa 

34651 caaattggca aaagtctaaa cttacagatg

34721 eagegetcta agegcagcea acacatctag 

34791 aeggatatag gcacgetaga gaagcccgga 

34861 cttatacatc aagcaaaccc ggtgtgttag 

34931 geacccagac gactcaaacg atgagtacac 

35001 aagaatgatg gaaacetaac taagcaatte

35071 agtatgtaga tgataaaetc ggaacaacga 

35141 aatecaagtt aaceeaaata atgegcaagg

35211 agagegcegg aeetaccagg eagtgttgaa 

35281 ca&acaagct atttaactte acgccctata

35351 acttgagcaa cagtggacag ttcctaatga

35421 geaggtacaa caatcaaece aaccgaacca 

35491 caggtggcge cattgaggga ttcggactaa

35561 agtegaccca gaeggeaacg geggeggtat 

35631 agaatcgata acgatgtgta ctttgattea 

35701 ctataactaa aattatgggg tggaaataat 

35771 gttaatactg gcggtttacg caatagttta 

35841 agttcgaacc eagaaagttc gttetcacta 

35911 cgtaccgaat gcatcaaacc aacaaagtgc 

35981 agtaegeaaa egcagacgac gcaagegaac 

36051 cacaacagtt gaccgaactg aaaactaaca 

36121 teatccaact tteaaagaca eeaaaacttt 

36191 cacgcagaca egggtgeaac cgacaaagaa 

36261 aagatgaaaa gtcacaggtg taatgcctga 

36331 gaeeeaccaa aeggcacgaa cacgaaeggc 

36401 cactctcaat gagattaaae taggtcaaaa

36471 gatgetaece agagggaaag acagaeagac 

36541 tgaaaatgtg gattcteggt ttgataggga 

36611 teteggeaet taaaggaggt gattaccaeg 
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ggccgcacgc aaaaaccaat cccacagaaa tgttaggcaa 
aagaggegge geaacaaegg caacagtccc aacaggcaaa 
gcagagcaaa caagaggaga cceaaccaea eeacaaggca 
caggcgcacc gataggcace gataaaaegg atacaaaaac 
agetattaga aagaataacc cagattcaaa aatactagtg 
ggcacaacaa tacgeegtaa agacaeggae aaaaacaaac 
ctcaaatatt agcttgtagt gagttagatg taccagegtt 
atacaaccca gcecctagga aagegagcat ggaggaegge 
attatgeacg agecaatcaa ggateattac agttetcacg 
eaattacaag tttacattca atgacaggtc ggaaaaeagt 
agaegaaggc acgagcaaac etgagaaaat gttcacacac 
gcgaaacaaa eeaaacacec gaaegacage gcegaagact 
aeaegattcc aggecaeaac ggegaeggea ecaatgaagc 
tggtcataag acattgeaag aecgcttgta ecatgateat 
gagaaagctg tagacgaaca ctataaagaa tatcgagega 
aaceggaaee eatcactgat etatcgecat atacaaatgc 
aacgaaaaee aeetaeaega cgcaagcecg tccaggtaat 
ggacaactta ttgacagate gceegeeaaa aaeggeggee 
atggagaaet atggaceeat ccagctgtae eggacagtaa 
eagaacegga gaaaeaactt atggtaatga aacgeaagae 
aegtcagega tttataatcc eacagaaaat ttaacgattt 
aagctaagaa tccaccgaat ttcattgaag taagaagtgc 
aetgtatcaa atggaeatac ccaeggaaea cacttcagat 
gcaggtatce tatattggta tacaggegae ecgaaeacag 
taaaaacaaa agaategtea tttaaacgac gtaecgatat 
ctcccaagaa gctgagggtc tagatatgta etacgaecta 
gtaactattg gacctggtaa taacagacat cactcaattt 
ectcaaaaaa cactgcacct caagtatcga tgactgattc 
gaacccagca eaeceaageg acateaegga agetggecae 
gcaetagatt tcccgttacc gaaagcgttt agagatgeag 
ataatggtgc tceaagacaa geacetacca gaaacagcac 
categacatt ttcaataaga aaaacaaegg agcatggaat 
cataccccta agagtateac aaaactaeca gatttaaaaa 
aagaaecaaa acgatttact gattecccca aagactecaa 
atcgaaeaca ccaggtaaca caacacaagt ateaagacgt 
getagaaace eeggtactgg tggcgttggt aaatggagtt 
agtagataat ttttcgaaag acgaCaactt aatcgagtta 
gacacaaaca tcagtttcta tgaatcagat agaggaactg 
acagacegtt atctataagc tctgaacatg ttaaaacatc 
agaeagaggc gettacatct cagacgaate aacgaeagca 
ataccgaatg aatttttaaa acattcaggc aaggtgcatg 
ataatgttgt tgttgaacgt caatttagct tcaatattga 
aacaaagctt gtttatatca aatctattca agatactatc 
aagcaagata tggatgaeac acaaaegtta atagcaaaag 
aaaecgaaat caagcaaaac gaagceaeac aagceaceac 
eacagetgaa gtcgaeaaaa eagetgaaaa agagcaagcg 
caaatcaatg gcgctgacct cgteaaaggt aacccaacaa 
attaeggeaa agcaategaa tcgtaeg&gc agtccataga 
gaetattcae attactaatg caacagatgc gecagaaaag 
caagaeggcg tegatgaegg ttcttcgttc gatgaatcaa 
ttgtttatge tgtcgaeaat aacactgctc gegcaacaeg 
aaaaeacaaa aeccacggca caeggtaccc gceccacaaa 
geegaagaaa cgtctaacaa cgctetaaat caagctaagc 
gceggcaaca acacaagaeg acagaggega aeggtcaatc 
cgatttggga tatctaactg ctggtaatta etatgeaaca 
agetacgagg gctatttaec ggeatecgee aaagacgata 
aceceaaaaa gaeetacaca cgatcaatca caaaeggcag 
acaeaageca aeggtaetgt ecgacggtgg agcaaaeggt 
tacacaaact attctatttt attagtaagt ggaacttatc 
ccacattacc taatgeaatt caaetaagta aagcgaatgt 
ccaegagtgt etaccatcca aaacaagtag caccacttca 
ggtaaaacac caggttctgg agegaaegee aacaaagtta 
gaaaatcaca geaaaegaea aaaatgaage tateggatae 
gatgeagacg ataacaacge gtctatcaaa tecaaagaag 
aeggegaaae eaaatacaat agcaatctcg aaaaagaaga 
gecagaeeta agtgatgagg aacttegegg aatggttgca 
atgetgacaa egcaactgac gcaacaaaac gceatgteaa 
aaacaaatac tgagggggac gtttaaatga tgaagatgat 
ttatgtgtgg ggttgctata aaaatgagca aaCcaagtgg 
gaatatgeat tgatcactgg tgaaaaatat ccagaggcaa
ggccceeeaa tttaacacaa agtaggrggc gtaatgtctg 
gaaeeagaag aceagaagag aaegaeaaaa caatgettag 
aacccaagag caagttaaca ttaaattaga taaaacttta 
gaaaaaaata agaaagaaaa cgacaaaaat atacgegata 
ctatcttcag tacgattgtc aeagctttac taagaactat 
eteaaaggga etteaggaca cagcctcegg gcgtgcttct 
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36681 ggtttggtaa acgtaaacaa cagccaagag ccagcgcttc ggcactggct ttttatttcg attgaaatga 

36751 ggtgcataca egggaceacc taacccaaag accagaaagc ccacagctag tgaagtggtg gagtgggcaa 

36821 agtcgaatat tggtaagagg atcaatatag ataaccatcg gggcagtcaa cgttgggaca cacctaactt 

36891 tatttttaaa agataeeggg gttecgeaac atggggcaac gctaaggata tggctaacta cagatatcct 

36961 aagggtttcc gattctatcg ttattcaccc ggateegtac cggaacctgg agacaccgca gtttggcacc 

37031 ctggcaacgg aacaggctcg gacggacaca ccgcaatagc agtaggacca tccaataaaa gccattttta 

37101 tagcgccgac caaaaceggg ttaattctaa tagttggaca ggctctccag gaagattagt aagacaccct 

37171 tatgtaagcg ceacaggctt tgtcaggcce ccacactcaa aagatactag caaacctagt agtaccgaca 

37241 caagctcagc atcaaaagcc aatgacccaa caattactgg cgaagcgaag aaaccgcaac ttaaagaagt 

37311 taaaacagta aaatacactg cttacagcaa cgctctagat aaagaagagc acttcattga tcatatagtt 

37381 gtaacgggtg atgaacgctc agacactcaa ggattataca taaaagaatc aatgcatatg cgctccgtag

37451 acgaactgta cacgcaaaga aataagtcca eaagcgaeea tgaaataccg catttatatg tcgatagaga

37521 ggctacatgg cttgctagac caaccaattt tgacgacccg cgtcacccca attggctagt taccgaagta 

37591 tgtggtggte aaacagatag caaacgacaa ttcttactga atcaaacaca agcgccaata cgtggtgctc 

37661 ggttactgtc agggattgat aaaaacttat ctgaaacgac gctaaaggca gaccctaata ttcggegtag 

37731 tatgaaagat ttaattaact acgacttgat taagcaaggc ataccggata acgcaaagta tgagcaagtt 

37801 aaaaagaaaa tgcttgagac atacattaaa cgagatatat cgacacgaga aaatataaaa gaagtaacga 

37871 caaaaacaac aataagaact agtgacaaaa caccagtcga cagtgcgccc acacgaggcc ctactccatc 

37941 agacgaaaaa ccaagcatcg ctactgaaac aagtccaccc acaeeccagc aagcactgga cagacaaatg 

38011 tctaggggra acccgaaaaa accccacaca cggggccggg ctaatgcaac acgagcacaa acgagctcgg 

38081 caatgaatgt taagcgaata tgggaaagca acacgcaaCg ctatcaaatg cttaatttag gcaagtacca 

38151 aggcatttca gttagtgcgc ttaacaaaat acttaaagga aaaggaacgc tcgacggaca aggcaaagca 

38221 ttcgcggaag cctgtaagaa aaacaacact aacgaaacct atttgaccgc gcacgcttcc ttagaaagtg 

38291 gatacggaac aageaacecc gcCagcggca gatacggtgc atataattac ttcggtattg gtgcattcga

38361 caacgaccct gactacgcaa cgacgtttgc caaaaacaaa ggttggacat crccagcaaa agcaatcatg 

38431 ggcggtgcta gcccegcaag aaaggactac atcaacaaag gtcaaaacac attgtaccga atcagacgga 

38501 accctaagaa tccagctacc caccaacacg ctaccgctac agagtggtgc caacatcaag caagtacaat 

38571 cgctaagtta tataaacaaa ccggcttaaa aggtatccac tccacaaggg ataaatacaa acaaagaggt 

38641 gtgtaaatgt acaaaataaa agatgctgaa acgagaataa aaaacgaegg tgttgactta ggtgacattg 

38711 gctgtcgatt ttacactgaa gatgaaaaca cagcacctat aagaataggt atcaatgaca aacaaggtcg 

38781 tatcgatcta aaagcacatg gcttaacacc tagattacat ttgtctatgg aagatggctc tatattcaaa 

38851 aatgagcccc ttattatega cgatgctgta aaagggttcc ttacctacaa aatacctaaa aaggttatca 

38921 aacacgctgg ttatgttegc tgtaagctgt ttttagagaa agaagaagaa aaaatacatg tcgcaaactt 

38991 ttctttcaat atcgttgata gtggtactga atctgctgta gcaaaagaaa tcgatgttaa attggtagat 

39061 gatgctatta cgagaatttt aaaagacaac gcgacagatt tattgagcaa agactttaaa gagaaaatag 

39131 ataaagatgt catttcttac atcgaaaaga argaaagtag acctaaaggt gcgaaaggtg acaaaggcga 

39201 accgggacaa cctggtgcga aaggtgatac aggtaaaaaa ggagaacaag gcgcacccgg taaaaacggt

39271 actgtagtat caatcaatcc tgacactaaa atgtggcaaa ttgatggtaa agatacagat atcaaagcag 

39341 aacctgagtt attggacaaa atcaataccg caaatgttga agggttagaa gataaattgc aagaagttaa 

39411 aaaaaccaaa gatacaactc tcaacgaccc taaaacgcat acggattcaa aaaccgctga actagttgat 

39481 agcgcgcctg aacctacgaa tacattaaga gaattagcag aagcaataca aaacaacrct atttcagaaa 

39551 gtgtattgca acagattggc ccaaaagtta gtacagaaga ttttgaggaa tteaaacaaa cactaaacga 

39621 tttatatgct ccaaaaaatc ataatcatga tgagcggtat gttttgtcat ctcaagcttt Cactaaacaa 

39691 caagcggata atttatatca actaaaaagc gcatctcaac cgacggttaa aatttggaca ggaacagaaa 

39761 atgaatataa ctatatatat caaaaagacc ccaatacact ttacctaatc aaggggtgat ttctatggaa 

39831 ggtaatttta aaaatgtaaa gaagtttatt tacgaaggtg aagaatatac aaaagtatat gctggaaaca 

39901 tccaagtacg gaaaaagcct tcatcttctg taataaaacc cttacctaaa aataaatatc cggatagcat 

39971 agaagaatca acagcaaaat ggacaacaaa tggagtcgaa cctaataaaa gttatcaggt gacaatagaa 

40041 aatgtacgta gcggtataat gagggccccg caaactaatc caggttcaag tgatttagga atateaggag 

40111 ccaatagcgg agttgcaagt aaaaatacca actttagtaa ccctccaggg atgttgtatg CeaccaCaag 

40181 tgatgtttat tcaggatctc caacattgac cattgaataa ttetaaacga ctaatttttt agtcgttttt 

40251 tattttggat aaaaggagca aacaaacgga cgcaaaagta ataacaagat acaccgtatt gaccttagca

40321 ctagtaaatc aattcttagc gaacaaaggc attagcccga ctccagtaga cgatgagact atatcatcaa 

40391 taatacttac tgttgtcgct ttatatacta cgtataaaga caatccaaca tctcaagaag gtaaatgggc 

40461 aaatcaaaag ctaaagaaat ataaagctga aaacaagtat agaaaagcaa cagggcaagc gccaattaaa 

40531 gaagtaacga cacccacgaa catgaacgac acaaacgatc cagggtaggt gttgaccaac gctgacaaca 

40601 aaaaaccaag cagaaaaacg gtttgacaat tcattaggga agcagcecaa tectgatttg ttttatggat 

40671 ttcagtgtta cgattacgca aatatgtttt ttacgatagc aacaggcgaa aggttacaag gtttatacgc 

40741 ctataatatt ccatttgata ataaagcaag gattgaaaaa tacgggcaaa taactaaaaa ctatgatagc 

40811 tttttaccgc aaaagttgga cattgtcgtt ttcccgtcaa agtatggtgg cggagctgga catgttgaaa

40881 ttgttgagag cgcaaattta aacactttca caccatatgg gcaaaattgg aatggtaaag gttggacaaa 

40951 tggcgttgcg caacccggrt ggggtcccga aactgtcaca agacacgttc attatCacga tgacccaatg 

41021 tattttatta gattaaattt cccagataaa gtaagtgtcg gagataaagc taaaagcgtt attaagcaag 

41091 caactgccaa aaagcaagca gtaattaaac ctaaaaaaac tatgcttgca gccggtcacg gtcataacga 

41161 tcctggagca gtaggaaacg gaacaaacga acgcgatttt atccgtaaat atataacgcc aaatatcgct 

41231 aagtatccaa gacatgcagg tcacgaagct gcateatacg gcggctcaag tcaatcacaa gacatgtatc 

41301 aagatactgc atacggtgtt aatgtaggaa ataataaaga ccacggacta tattgggtta aaccacaggg 

41371 gtatgacatt gttctagaga cccacccaga cgcagcagga gaaaatgcaa gtggtgggca tgttattaCc 

41441 tcaagtcaat tcaatgcgga tactatcgat aaaagtatac aagatgctae taaaaataac ctaggacaaa 

41511 taagaggtgt aacacctcgt aatgatttac cgaacgttaa tgeaccagca gaaataaata tcaaecaccg 

41581 tttacctgaa ttaggtttta ccactaataa aaaagatatg gactggatta agaagaatca tgacctgtat 

41651 tctaaaccaa cagccggtgc gacccatggt aagcctatag gtggttcggt agccggtaac gtcaaaacac 

41721 cagctaaaaa ccaaaaaaac ccaccagcgc cagcaggcca cacacccgac aagaacaaCg cgccCCataa 

41791 aaaagagact ggtaattaca cagttgccaa tgctaaaggc aacaacgcaa gggacggcta ctcaactaat 

41861 tcaagaatta caggtgtatt acctaataac gcaacaatca aatatgacgg cgcatattgc atcaatgggt 

41931 acagatggat tacttatatt gccaatagcg gacaacgccg ctacactgcg acaggagagg cagacaaagc 
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42001 aggtaatagg ataagtagtt ttggtaagtt 

42071 caccaatcac agggaatctt acagttatta 

42141 ccaacatcac tctcaagate taaacgtaga 

42211 taatgtaatt acactaccag caaccaatct

42281 gaggacttac ttgcgtaaag tagtaagaag 

42351 gctgtttttc atgttatatt ataaatgacc 

42421 tatgcaaaaa aaacgaaaaa aagttcataa 

42491 ataccagttg agaggaggat aaaaagtgct 

42561 atgtcagcaa ttgccatagc gaaaacattg 

42631 tccatatata aaccccaaca ctaaaatact

42701 taaacgtgtt tttaggcaac gacataagca 

42771 ettatggaag agggataaaa atgacagcaa 

42841 agaaacggga tataaaattg ctaaaaatec 

42911 aaaacatctt tatcagatgc cagatttaga 

42981 acgaagaaga taaacaaaag gagccaaaaa 

43051 aaagaagtat ttgaatcagg taaaaactct 

43121 atgatagata cgtagtactt gaccataaaa 

43191 caaaagaaaa ttagtaagtt aaataattag 

43261 cgcgtgtcaa acacgtgtca atttagttct 

43331 cgcatagtta taggcttttc agctatatac 

43401 tggaaacctt gaettaatgg ggttttaatc 

43471 cacgttgacc ttgctctttt ttatgttcac 

43541 ataatggccc aaccttttgc eaatatattc 
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tagcacgatt tagtatctac ttagaataaa aatttegcta 
aataactatt tggatggatg ttaatattcc catacacttc 
caacaggcag gtactacggc actegcccat tttcttgtta 
ggcttaaaac cacatttccg gcagccaatc cggccatgca 
ctgactgcat acctaaacca cccacaccag ttgctgggtg 
aaaccacacc acctatcaac ccaggagtgt ggttattttt 
aaagtaetgc atatcaegtt taaccgtgtt ataataaggt 
agaaaatttt aaaactatag cagaaatcgc cttttacaca 
aaaaaagacg ataagtaagt agacaagccc gaaagggctg 
atgaaaacaa tttacattat tttaatcact cttatttgga 
aaagtgttgt tgcactgctt actactttac tgcttatcaa 
taaaagaaat aattgaatca atagaaaagt tattcgaaaa 
cggattaeca tatcaaaccg tgcaagactt aagaaatgga 
acgataacaa agttatacga gtatcaaaga tcgcttgaaa 
tacgtttgtt acaaaagaag aacttaaaac tttgaatgta 
ataaaaatta cagatggaag acacgcaata catcgggcaa 
aaggcgattt gtacccgcaa aaagcatacc caaaatatat 
aaaaccacgt cttaattgac gtggttactt tttaggtttg 
atttctttag ttttcctcct aaacttaatt gctcgtaaac 
caagataaga tttateccgc cgtctccata aaaatatgct 
tagcaagtgt caaatatgtg tcaagaaaat aattttctga 
caagtaagtg agagtaggtg tctaaagtta tagatatatt 
aatagg 
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Table 10 



Bacteriophage 96 ORFs list 



SID     LAN       FRA  POS           a.a.  RBS sequence                STA  STO
100733  96ORF001  1    25999..29142  1047  ccttgaatcgaaaggaggttagcct   ttg  taa
100734  96ORF002  1    32008..33906  632   cttttacgactaaaggaggcaacca   atg  taa
100735  96ORF003  1    30109..31995  628   ttatattttagataaggagtagcct   atg  taa
100736  96ORF004  1    36760..38634  624   attttgattgaaatgaggtgcatac   a??  taa
100737  96ORF005  3    33903..35729  608   gtttattcgaaggaaaggtggttga   ata  taa
100738  96ORF006  2    40589..42043  484   aatgatttagggtaggtgttgacca   atg  tag
100739  96ORF007  1    18652..20091  479   catacacacatactaaacctgaacg   att  tga
100740  96ORF008  2    8960..10201   413   tggcagaatttgggggcgataacga   atg  tga
100741  96ORF009  2    17447..18670  407   gacgcaataacggaagtgatcgtca   atg  tga
100742  96ORF010  1    38647..39819  390   taaatataaataaagaggtgtgtaa   atg
100743  96ORF011  -1   119..1195     358   gtagctcgcctacccttactatttt   tcg  tga
100744  96ORF012  2    20045..21013  322   ttCaatgacaaattacctgacatag   atg  tga
100745  96ORF013  3    29157..30098  313   acttattataagggaggtttgttag   ttg  taa
100746  96ORF014  1    21925..22839  304   agaaaataaagtgaggtaataaaat   atg  tag
100747  96ORF015  1    5812..6591    259   acacacggtaaaggtgggagaatag   atg  taa
100748  96ORF016  1    7852..8607    251   aataaaatgttgaaaggagagaaaa   atg  taa
100749  96ORF017  3    3444..4190    248   aaatttaacattaatatcactttaa   gtg  taa
100750  96ORF018  -3   28281..29000  239   taagctatgttgaacatcgctagtc   atg  tga
100751  96ORF019  3    7188..7859    223   tttaccgrtctaggacgtggtttaa   atg  taa
100752  96ORF020  3    21324..21908  194   gaagggcaaaaaggagttttgatat
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ttg 


tga 


101074  96ORF342       40913..41023  36    tatctgggaaatttaatctaataaa   ata  tga
101075  96ORF343       39173..39283  36    tgccacattctagtgtcaggattga   ttg  ta?
101076  96ORF344       37580..37690  36    gggtctacctttaacgtcgtttcag   ata  taa
101077  96ORF345       31556..31666  36    ggattattctttctaataacttcaa   ttg  tga
101078  96ORF346       29972..30082  36    ggctactccttatctaaaatataat   ttg  taa
101079  96ORF347       28787..28897  36    ctgccaaagtctgtagcaattactt   ttg  tga
101080  96ORF348       21839..21949  36    ttaaaatccgataaaataacatcgc        tga
101081  96ORF349       3647..3757    36    taaaacctccgaagttacccagcgc   ttg  tga
101082  96ORF350  -2   40801..40911  36    accattccaattttgcccatacgat   gtg  tag
101083  96ORF351  -2   38953..39063  36    tatcttttaaaattctcgtaatagc   atc  taa
101084  96ORF352  -2   31585..31695  36    tagctgtcatcactagtatttttga   atc  taa
101085  96ORF353  -2   24550..24660  36    atagtccgttttaccgcctcgtact   att  tag
101086  96ORF354  -2   20083..20193  36    atcatcattttgatatttctcaaac   ata  tga
101087  96ORF355  -2   991..1101     36    gcatcttggcagtacgacgtaaaac   atc  tag
101088  96ORF356  -3   38148..38258  36    caagaaagcgcgcgcgatcaaataa   att  tga
101089  96ORF357  -3   8790..8900    36    tgaagttacccagcgctatttttct   ctg  tag
101090  96ORF358  -3   4458..4568    36    ttcataaaagtattctttgtagtat   acg  tag
101091  96ORF359  1    4666..4773    35    ttatcaaaatatacaacttaattaa   atc  tag
101092  96ORF360  1    11569..11676  35    acaaatttaccgaacatgaaaatga   att  tga
101093  96ORF361  2    6122..6229    35    ggaaaacaaattgatgttgtagtga   ttg  taa
101094  96ORF362  -1   40418..40525  35    ttcgtaggtgtcattacttctttaa   ttg  tag
101095  96ORF363  -1   34358..34465  35    gttttgcttgatttcgatttgttga   atg  tga
101096  96ORF364  -1   20654..20761  35    ctatttccactgattccccatctaa   atg  tga
101097  96ORF365  -1   8423..8530    35    tcctttttagagttacgaggtttca   att  tag
101098  96ORF366  -1   2402..2509    35    tgacgcatggcaacactttagacca   atc  taa
101099  96ORF367  -2   36607..36714  35    aaaacaaaaagccagtgccgaagca   ctg  tag
101100  96ORF368  -2   27061..27168  35    caaatcgtcctgcagcgttcaataa   atc  tag
101101  96ORF369  -2   26470..26577  35    atgagttgttaagtttaccccaaac   atc  taa
101102  96ORF370  -2   10327..10434  35    ccgtgccatcttctcggtataagta   ata  taa
101103  96ORF371  -2   8650..8757    35    gggtacgggttgttactgttgatat   atc  taa
101104  96ORF372       14382..14489  35    gttcttttaattgatctactgttaa   att  taa
101105  96ORF373       8151..8258    35    atgrtttgttagtctctgtgtagtct  atg  taa
101106  96ORF374  -3   5007..5114    35    aaacgatttaagtggaacattattc   ata  taa
101107  96ORF375  2    30563..30667  34    cgattagaaatctttaaaaaaggac   ttg  tga
101108  96ORF376       19916..20020  34    tctatgtcaggtaatttgtcattaa   att  taa
101109  96ORF377  -1   9236..9340    34    cttttctgttagtaattgtttttaa   atc  taa
101110  96ORF378       9026..9130    34    actctttatctttagttgcttttaa   ata  tag
101111  96ORF379       28447..28551  34    cttttgtgataataaagtttagtgc   ttg  tga
101112  96ORF380  -3   40329..40433  34    ccatttaccttcttgagatgttgga   ttg  tga
101113  96ORF381       39801..39905  34    caaaagatgaaggctttttccacac   ???  taa
101114  96ORF382  -3   33831..33935  34    atgttgtttgtaactcgattaagtc   atc  tga
101115  96ORF383  -3   33687..33791  34    gttattacgtcttaatacttgtgtt   gtg  tag
101116 


960RP384 


— 


13530. .13634 


34 
34 


t at aegcactagt act gat cact ga 


ttg 
att 


taa 
taa 


101117 
101118 


960RF385 
960RF386 


1 


3843. .3947 
12256. .123S7 


33 
33 


tttgatcgattgttctagttaagaa 
agtcataaagaagttagcaatgtga 


ttg 
ate 


tag 
tag 


101119 
101120 


960RF387 
960RF3B8 


2 
2 


2207. .2308 
2S19. .2620 


33 


t ccaagact ct t taac tgt t aa ct t 
attgttgaatttcgattgatctaaa 


atg 


tga 


101121 


960RP389 


2 


22517. .22618 


33 


agaagtaaaatgegtaatgetttag 


atg 


tag 
taa 


101122 
101123 


96ORF390 
960RP391 


2 
2 


27302 . .27403 
32384. .32485 


33 
33 


t t ccaaaat tgggc t aatagtgt ag 
actaaaaaggttgagaaagctgtag 


ctg 
acg 


taa 


101124 


96ORF392


2


39287..39388


33
33


aaaaacggtactgtagtatcaatca


atc


tag 
taa 


101125 
101126 


96ORF393
96ORF394


3
3


18153..18254
24189..24290


33


gtagtatatgccgactttgatttga
tcagaccctaacattaacaaactag


ttg 


tga 


101127 


96ORF395


-1


15266..15367


33


ccgataatttgtatagcttgtttta


atg 


tag 


101128 


96ORF396


-2


32239..32340


33 


ttttagcgaaagcatctagtgttga 


ata 


tag 


101129 


96ORF397


-2


16123..16224


33 


ttatgtgtgcctatcatattaacaa 


ttg 


tag 


101130 


96ORF398


-2


13648..13749


33 


tctttaactgaatgttgaatagcat 


ttg 


tag 


101131 


96ORF399


-2


10987..11088


33 


acttctgtaggtattcttatatcaa 


ttg 


tga 


101132 


96ORF400


-2


3382..3483


33 


cttactggtaattcttcaaaattaa 


atg 


taa 


101133 


96ORF401 


-3 


40794..40895


33 


ccacatgatgtgaaagtgtttaaat 




taa 


101134 


96ORF402


-3


39978..40079


33 


acattcctaaatcacttgaacctaa 


att 


tga 


101135 


96ORF403 


-3 


38607..38708


33


atcttcagtgtaaaatcgacagcca


atg 


tag 


101136 


96ORF404 


-3 


21288..21389


33 


cagacaccgtcttaagtccctttag 


ata 


taa 
Table 11

SEQUENCE INFORMATION FOR PHAGES MATCHING WITH TABLE 1 

M32695

Bacteriophage PM2 nuclease cleavage site
gi|166145|gb|M32695|BM2NCS [166145]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M32693

Bacteriophage PM2 Hind III fragment 4
gi|166144|gb|M32693|BM24HIND3 [166144]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M32694

Bacteriophage PM2 Hind III fragment 3
gi|166143|gb|M32694|BM23HIND3 [166143]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M26134

Bacteriophage PM2 structural protein gene containing purine/pyrimidine rich
regions and anti-Z-DNA-IgG binding regions, complete cds
gi|289360|gb|M26134|BM2PROTIV [289360]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
J02452

bacteriophage f1 3'-terminal region rna
gi|215409|gb|J02452|PF1TR3 [215409]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
AF020798

Bacteriophage Chp1 genome DNA, complete sequence
gi|217761|dbj|D00624|BCP1 [217761]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 12 protein links, or 1 genome link)
X72793

Clostridium botulinum C phage BONT/C1, ANTP-139, ANTP-33, ANTP-17, ANTP-70
genes and ORF-22
gi|516171|emb|X72793|CBCBONT [516171]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 6 protein links, or 4 nucleotide neighbors)
X51464

Clostridium botulinum D Phage C3 gene for exoenzyme C3
gi|14907|emb|X51464|CBDPE3 [14907]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
D90210

Bacteriophage c-st (from C. botulinum) C1-tox gene for botulinum C1 neurotoxin
gi|217780|dbj|D90210|CSTC1TOX [217780]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)



S49407

type D neurotoxin [bacteriophage d-16 phi, host - C. botulinum, type D, CB16, Genomic, 4087 nt]
gi|260238|gb|S49407|S49407 [260238]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X53370

Bacteriophage phi29 temperature sensitive mutant TS2(98) DNA polymerase gene
gi|15733|emb|X53370|POTS298 [15733]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 7 nucleotide neighbors)
X53371

Bacteriophage phi29 temperature sensitive mutant TS2(24) DNA polymerase gene
gi|15731|emb|X53371|POTS224 [15731]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 7 nucleotide neighbors)
X05973

Bacteriophage phi29 prohead RNA
gi|15680|emb|X05973|POP29PRO [15680]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 4 nucleotide neighbors)
V01155

Left end of bacteriophage phi-29 coding for 15 potential proteins. Among
these are the terminal protein and the proteins encoded by the genes 1, 2 (sus), 3, and (probably) 4
gi|15659|emb|V01155|POP29B [15659]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 16 protein links, or 16 nucleotide neighbors)
X73097

Bacteriophage phi-29 left origin of replication
gi|312194|emb|X73097|BP29ORIL [312194]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
M14430

Bacteriophage phi-29 gene-17 gene, complete cds
gi|215321|gb|M14430|P29G17A [215321]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 6 protein links, or 8 nucleotide neighbors)
M14431

Bacteriophage phi-29 gene-16 gene, complete cds
gi|215319|gb|M14431|P29G16A [215319]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 7 nucleotide neighbors)
M20693

Bacteriophage phi-29 DNA, 3' end
gi|215343|gb|M20693|P29REPINB [215343]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
M21016

Bacteriophage phi-29 DNA, 5' end
gi|215342|gb|M21016|P29REPINA [215342]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M12456

Bacteriophage phi-29 genes 9, 10 and 11 encoding p9 tail, incomplete, p10
connector, complete, and p11 lower collar, incomplete, respectively
gi|215338|gb|M12456|P29P9 [215338]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 2 nucleotide neighbors)
M14782

Bacillus phage phi-29 head morphogenesis, major head protein, head fiber
protein, tail protein, upper collar protein, lower collar protein, pre-neck
appendage protein, morphogenesis(13), lysis, morphogenesis(15), encapsidation genes, complete cds
gi|215323|gb|M14782|P29LATE2 [215323]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 11 protein links, or 11 nucleotide neighbors)
M26968

Bacteriophage phi-29 (from Bacillus subtilis) proteins p1, delta-1 genes, complete cds, and the sus1(629) mutation
gi|341558|gb|M26968|P29P1D1A [341558]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
J02448

Bacteriophage f1, complete genome
gi|166201|gb|J02448|F1CCG [166201]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, 205 nucleotide neighbors, or 1 genome link)
M24832

Bacteriophage f2 coat protein gene, partial cds
gi|166228|gb|M24832|F2CRNACA [166228]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
J02451

Bacteriophage fd, strain 478, complete genome
gi|215394|gb|J02451|PFDCG [215394]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 MEDLINE links, 10 protein links, 204 nucleotide neighbors, or 1 genome link)
M34834

Bacteriophage fr replicase gene, 5' end
gi|166139|gb|M34834|BFRREGFA [166139]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 9 nucleotide neighbors)
M38325

Bacteriophage fr replicase gene, 5' end
gi|166137|gb|M38325|BFRREGR [166137]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 9 nucleotide neighbors)
M35063

Bacteriophage fr coat protein replicase cistron (R region) RNA
gi|166134|gb|M35063|BFRRCRRA [166134]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 3 nucleotide neighbors)
S66567

alpha-atrial natriuretic factor/coat protein=fusion polypeptide [human,
bacteriophage fr, expression vector pFAN15, Plasmid Synthetic Recombinant, 510 nt]
gi|435742|gb|S66567|S66567 [435742]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 15 nucleotide neighbors)
X15031

Bacteriophage fr RNA genome
gi|15071|emb|X15031|LEBFRX [15071]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, 9 nucleotide neighbors, or 1 genome link)
U51233

Mus musculus neutralizing anti-RNA-bacteriophage fr immunoglobulin variable
region light chain (IgM) mRNA, partial cds
gi|1277150|gb|U51233|MMU51233 [1277150]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 1669 nucleotide neighbors)
U51232

Mus musculus neutralizing anti-RNA-bacteriophage fr immunoglobulin variable region heavy chain (IgM) mRNA, partial cds
gi|1277148|gb|U51232|MMU51232 [1277148]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 1073 nucleotide neighbors)
U02303

Bacteriophage If1, complete genome
gi|3676280|gb|U02303|B2U02303 [3676280]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 10 protein links, or 1 genome link)
V00604

Phage M13 genome
gi|14959|emb|V00604|INM13X [14959]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, or 205 nucleotide neighbors)

A32252

Synthetic bacteriophage M13 protein III probe
gi|1567340|emb|A32252|A32252 [1567340]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
A32251

Synthetic bacteriophage M13 protein III probe
gi|1567339|emb|A32251|A32251 [1567339]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M12465

Bacteriophage M13 mp10 mutations in lac operon
gi|215210|gb|M12465|M13LACMUT [215210]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 215 nucleotide neighbors)
M24177

Synthetic Bacteriophage M13 (clone M13.SV312) SV40 early promoter region DNA
gi|209416|gb|M24177|SYNSVB12 [209416]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M24176

Synthetic Bacteriophage M13 (clone M13.SV311) SV40 early promoter region DNA
gi|209415|gb|M24176|SYNSVB11 [209415]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M24175

Synthetic Bacteriophage M13 (clone M13.SV.8) SV40 early promoter region DNA
gi|208806|gb|M24175|SYNM13SV8 [208806]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 242 nucleotide neighbors)
M19979

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207813|gb|M19979|SYN33M13M [207813]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 617 nucleotide neighbors)
M19565

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207808|gb|M19565|SYN33M13H [207808]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 567 nucleotide neighbors)
M19564

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207807|gb|M19564|SYN33M13G [207807]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 12 nucleotide neighbors)
M19563

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207806|gb|M19563|SYN33M13F [207806]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 262 nucleotide neighbors)
M19561

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207804|gb|M19561|SYN33M13D [207804]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 27 nucleotide neighbors)
M19560

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207803|gb|M19560|SYN33M13C [207803]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M19559

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207802|gb|M19559|SYN33M13B [207802]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 227 nucleotide neighbors)
M10568

Bacteriophage M13 replicative form II replication origin, specific nick location
gi|215220|gb|M10568|M13ORIB [215220]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 650 nucleotide neighbors)
M10910

Bacteriophage M13 gene II regulatory region and M13sj1 mutant
gi|215209|gb|M10910|M13IIREG [215209]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 72 nucleotide neighbors)
M38295

Bacteriophage M13 HaeIII restriction fragment DNA
gi|215208|gb|M38295|M13HAEIII [215208]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 67 nucleotide neighbors)
DNA encoding a part of Bacteriophage M13 tg 127
gi|2170311|dbj|E02067|E02067 [2170311]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
J02467

Bacteriophage MS2, complete genome
gi|215232|gb|J02467|MS2CG [215232]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 8 MEDLINE links, 4 protein links, 20 nucleotide neighbors, or 1 genome link)

AJ004950
Bacteriophage P1 ban gene
gi|3688226|emb|AJ011592|BP1011592 [3688226]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)
U88974

Bacteriophage P1 structural lytic transglycosylase (orf47), pep44b (orf44b),
pep44a (orf44a), and pep43 (orf43) genes, complete cds; and pep42 (orf42) gene, partial cds
gi|2661099|gb|AF035607|AF035607 [2661099]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 protein links, or 1 nucleotide neighbor)

AJ000741
Bacteriophage P1 darA operon
gi|2462938|emb|AJ000741|BPAJ7641 [2462938]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, or 31 nucleotide neighbors)
X01828

Bacteriophage P1 recombinase gene cin
gi|15133|emb|X01828|MYP1CIN [15133]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
X98146

Bacteriophage P1 DNA sequence around the Op88 operator
gi|1359513|emb|X98146|BP1OP88OP [1359513]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
S61175

immI operon: icd=cell division repressor, ant1=antirepressor {promoters
P51a, P51b} [bacteriophage P1, Genomic, 728 nt]
gi|385908|gb|S61175|S61175 [385908]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)

X87824
Bacteriophage P1 gene 26
gi|861164|emb|X87824|XXBP1G26 [861164]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)
X15638

Phage P1 DNA for lytic replicon containing promoter P53 and two open reading frames
gi|15735|emb|X15638|PP1LREP [15735]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 24 nucleotide neighbors)
X17512

Bacteriophage P1 DNA for immunity region immI
gi|15479|emb|X17512|P1IMMUNIY [15479]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 4 nucleotide neighbors)
X16005

Bacteriophage P1 c1 gene for P1 c1 repressor protein
gi|15477|emb|X16005|P1CI [15477]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
X03453

Bacteriophage P1 cre gene for recombinase protein
gi|15135|emb|X03453|MYP1CRE [15135]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 12 nucleotide neighbors)
X06561

Bacteriophage P1 c1 gene 5'-region
gi|15128|emb|X06561|MYP1C1 [15128]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 6 nucleotide neighbors)
V01534

Bacteriophage P1 genome fragment (IS2 insertion spot). This region contains
four unidentified reading frames and is known as insertion hot spot for IS2 insertion sequences
gi|15118|emb|V01534|MYOVP1 [15118]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 3 nucleotide neighbors)

X56951
Bacteriophage P1 gene10
gi|406728|emb|X56951|BPP1GP10 [406728]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 3 protein links, or 1 nucleotide neighbor)
K02380

Bacteriophage P1 replication region including repA, parA, and parB genes and
incA, incB, and incC incompatibility determinants
gi|215652|gb|K02380|PP1REP [215652]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 MEDLINE links, 4 protein links, or 8 nucleotide neighbors)
X87674

Bacteriophage P1 lydA & lydB genes
gi|974763|emb|X87674|BACP1LYD [974763]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)

X87673
Bacteriophage P1 gene 17
gi|974761|emb|X87673|BACP117 [974761]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
M16618

Bacteriophage P1 c1 repressor binding sites
gi|215600|gb|M16618|PP1C1 [215600]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
SEG_PP1CIN

Bacteriophage P1 cin gene encoding recombinase, cixL recombination site, and 5' end of C invertible element
gi|215607|gb||SEG_PP1CIN [215607]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
K03173

Bacteriophage P1 C invertible element, right end, and cixR recombination site
gi|215606|gb|K03173|PP1CIN2 [215606]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
215605

Bacteriophage P1 cin gene encoding recombinase, cixL recombination site, and 5' end of C invertible element
gi|215605|lcl|X01828 [215605]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M25470

Bacteriophage P1 tail fiber protein gene, complete cds
gi|341349|gb|M25470|PP1TFPR [341349]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 3 nucleotide neighbors)
M34382

Bacteriophage P1 sim region proteins, complete cds
gi|215661|gb|M34382|PP1SIM [215661]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
M81956

Bacteriophage P1 R protein (R) gene, complete cds
gi|215658|gb|M81956|PP1RP [215658]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 4 nucleotide neighbors)
M37080

Bacteriophage P1 mini-P1 plasmid origin of replication
gi|215657|gb|M37080|PP1REPOR [215657]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 46 nucleotide neighbors)
M27041

Bacteriophage P1 ref gene, complete cds
gi|215650|gb|M27041|PP1REF [215650]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
L01408

Bacteriophage P1 partition protein (parB) gene, 3' end
gi|215642|gb|L01408|PP1PARB [215642]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 41 nucleotide neighbors)

SEG_PP1PAR
Bacteriophage miniplasmid P1 parA gene, 5' end
gi|215639|gb||SEG_PP1PAR [215639]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 48 nucleotide neighbors)
M36425

Bacteriophage miniplasmid P1 parB gene, 3' end
gi|215638|gb|M36425|PP1PAR2 [215638]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
Bacteriophage miniplasmid P1 parA gene, 5' end
gi|215637|gb|M36424|PP1PAR1 [215637]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M11129

Bacteriophage P1 miniplasmid origin of replication region
gi|215632|gb|M11129|PP1ORIM [215632]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 43 nucleotide neighbors)

M25414

Bacteriophage P1 c1 repressor binding site, operator 88 (Op88)
gi|215631|gb|M25414|PP1OP88A [215631]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
M25413

Bacteriophage P1 c1 repressor binding site, operator 68 (Op68)
gi|215630|gb|M25413|PP1OP68A [215630]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M25412

Bacteriophage P1 c1 repressor binding site, operator 21 (Op21)
gi|215629|gb|M25412|PP1OP21A [215629]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M10510

Bacteriophage P1 recombination site loxR
gi|215628|gb|M10510|PP1LOXR [215628]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M10287

Bacteriophage P1 loxP X loxP recombination site
gi|215627|gb|M10287|PP1LOXPX [215627]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
M10494

Bacteriophage P1 recombination site loxP
gi|215626|gb|M10494|PP1LOXP [215626]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 134 nucleotide neighbors)
M10511

Bacteriophage P1 recombination site loxL
gi|215625|gb|M10511|PP1LOXL [215625]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M10512

Bacteriophage P1 recombination site loxB
gi|215624|gb|M10512|PP1LOXB [215624]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M10145

Bacteriophage P1 genome fragment with recombination site loxP
gi|215623|gb|M10145|PP1CREX [215623]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 21 nucleotide neighbors)
M13327

Bacteriophage P1 Cin recombinase activated cross over site, junction IV, clone pSHI326
gi|215622|gb|M13327|PP1CN26IV [215622]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13325

Bacteriophage P1 Cin recombinase activated cross over site, junction II, clone pSHI326
gi|215621|gb|M13325|PP1CN26II [215621]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1401 nucleotide neighbors)
M13323

Bacteriophage P1 Cin recombinase activated cross over site, junction IV, clone pSHI325
gi|215620|gb|M13323|PP1CN25IV [215620]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13321

Bacteriophage P1 Cin recombinase activated cross over site, junction II, clone pSHI325
gi|215619|gb|M13321|PP1CN25II [215619]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1058 nucleotide neighbors)
M13324

Bacteriophage P1 Cin recombinase activated cross over site, junction I, clone pSHI326
gi|215618|gb|M13324|PP1CIN26I [215618]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13319

Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI327
gi|215617|gb|M13319|PP1CIN27R [215617]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13320

Bacteriophage P1 Cin recombinase activated cross over site, junction I, clone pSHI325
gi|215616|gb|M13320|PP1CIN25I [215616]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13318

Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI324
gi|215615|gb|M13318|PP1CIN24L [215615]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1370 nucleotide neighbors)
M13317

Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI323
gi|215614|gb|M13317|PP1CIN23R [215614]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1055 nucleotide neighbors)
M13316

Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI323
gi|215613|gb|M13316|PP1CIN23L [215613]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13315

Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI322
gi|215612|gb|M13315|PP1CIN22R [215612]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI322
gi|215611|gb|M13314|PP1CIN22L [215611]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1401 nucleotide neighbors)
M13313

Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI321
gi|215610|gb|M13313|PP1CIN21R [215610]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13312

Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI321
gi|215609|gb|M13312|PP1CIN21L [215609]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1058 nucleotide neighbors)
M16568

Bacteriophage P1 c4 repressor gene, complete cds
gi|215603|gb|M16568|PP1C4 [215603]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
M13326

Bacteriophage P1 Cin recombinase activated cross over site, junction III, clone pSHI326
gi|215602|gb|M13326|PP1C26III [215602]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1192 nucleotide neighbors)
M13322

Bacteriophage P1 Cin recombinase activated cross over site, junction III, clone pSHI325
gi|215601|gb|M13322|PP1C25III [215601]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1231 nucleotide neighbors)

J05651

Bacteriophage P1 modulator protein (bof) gene, complete cds
gi|215598|gb|J05651|PP1BOFY1 [215598]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
M33224

Bacteriophage P1 regulatory protein (bof) gene, complete cds
gi|215596|gb|M33224|PP1BOFFO [215596]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
M10288

E.coli/bacteriophage P1 loxR recombination site
gi|146647|gb|M10288|ECOLOXR [146647]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
M10289

E.coli/bacteriophage P1 loxL recombination site
gi|146646|gb|M10289|ECOLOXL [146646]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M10290

E.coli loxB site, which can recombine with bacteriophage P1 loxP site
gi|146645|gb|M10290|ECOLOXB [146645]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M10287

Bacteriophage P1 loxP X loxP recombination site
gi|215627|gb|M10287|PP1LOXPX [215627]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
M74046

Bacteriophage P1 pacA and pacB genes, complete cds
gi|215634|gb|M74046|PP1PACAB [215634]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
M95666

Bacteriophage P1 gene 10, doc and phd genes, complete cds
gi|463276|gb|M95666|PP1PHDDOC [463276]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 4 protein links, or 1 nucleotide neighbor)
M25604

Bacteriophage Q-beta mutated autonomously replicating sequence MDV1 RNA fragment
gi|556359|gb|M25604|PQBARSMUT [556359]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 8 nucleotide neighbors)
V00643

first half of the phage Q-beta gene for coat protein
gi|15088|emb|V00643|LEQBET [15088]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
M25167

Bacteriophage Q-beta RNA fragment recovered from replicase binding complex
gi|556362|gb|M25167|PQBREPUCB [556362]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M24876

Bacteriophage Q-beta replicase RNA, 5' end
gi|556360|gb|M24876|PQBREPUCA [556360]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
M25444

Synthetic bacteriophage Q-beta DNA
gi|209118|gb|M25444|SYNPQBTERM [209118]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 8 nucleotide neighbors)
M25463

Bacteriophage Q-beta self-replicating microvariant (+) RNA
gi|532489|gb|M25463|PQBMVSRRNA [532489]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M25014

Bacteriophage Q-beta RNA replicase gene, 5' end, and maturation protein gene, 3' end
gi|294316|gb|M25014|PQBREPLC [294316]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
M25065

Bacteriophage Q-beta RNA sequence with putative stem loop
gi|294315|gb|M25065|PQBLOOP [294315]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
M10265

Bacteriophage Q-beta RNA molecule with the ability to replicate extracellularly
gi|215726|gb|M10265|PQBRNA [215726]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 8 nucleotide neighbors)
M24815

Bacteriophage Q-beta specified replicase subunit RNA,
gi|215725|gb|M24815|PQBREPL [215725]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
M25461

Bacteriophage Q-beta plus-strand RNA, 5' terminus
gi|215724|gb|M25461|PQBPS5E [215724]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M25462

Bacteriophage Q-beta plus-strand RNA, 3' terminus
gi|215723|gb|M25462|PQBPS3E [215723]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 8 nucleotide neighbors)
M24871 

Bacteriophage Q-beta nanovariant WSM RNA 
gi|2 1 5722|gb|M2487 HPQBNVWSIC [2 15722] 

(View GenBank repon,FASTA repor^ASN.! rcport,Graphical view, I MEDLINE link, or 2 nucleotide neighbors ) 
M24870 

Bacteriophage Q-beta nanovariant WSH RNA 
giJ2 1 572 1 |gb|M24 870|PQBNVWSra (2 1572 i ] 

(View GenBank repon,FASTA repooASN.l repon,Graphical view,l MEDLINE link, or 2 nucleotide neighbor* ) 
M24869 

Bacteriophage Q-beta nanovariant WSI RNA 
gi]2 ! 5720|gb|M24869|PQBNVWSIA (2 15720] 

(View GenBank report,FASTA reporv\SN.l report,Graphical view.l MEDLINE link, or 2 nucleotide neighbors ) 
M10495

Coliphage Q-beta MDV-1 (+) RNA
gi|215719|gb|M10495|PQBMDV1A [215719]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 10 nucleotide neighbors)
J02484

bacteriophage qbeta coat protein cistron first half
gi|215717|gb|J02484|PQBCP5 [215717]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
M57754

Bacteriophage Q-beta minus strand RNA, 5' terminus
gi|215716|gb|M57754|PQBBMS5E [215716]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 8 nucleotide neighbors)
M24297

Bacteriophage Q-beta 5'-terminal region of the minus strand
gi|215715|gb|M24297|PQB5END [215715]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 8 nucleotide neighbors)
Bacteriophage Q-beta, MDV-1 RNA
gi|215714|gb|M10695|PQB I IR [215714]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 12 nucleotide neighbors)
M24827

Bacteriophage R17 A protein gene, 5' end
gi|216078|gb|M24827|R17RNACIS [216078]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
M24829

Bacteriophage R17 coat protein gene, 5' end
gi|216075|gb|M24829|R17CP5 [216075]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
J02488

bacteriophage r17 rna synthetase initiation site
gi|216080|gb|J02488|R17RNASYN [216080]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 2 protein links, or 6 nucleotide neighbors)
J02487

bacteriophage r17 coat protein initiation site
gi|216073|gb|J02487|R17COATP [216073]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
J02486

bacteriophage r17 a protein initiation site
gi|216071|gb|J02486|R17APROT [216071]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M24826

Bacteriophage R17 coat protein RNA fragment
gi|216077|gb|M24826|R17CPRAA [216077]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M24296

Bacteriophage R17 3'-terminal fragment A RNA
gi|216070|gb|M24296|R173TFA [216070]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 9 nucleotide neighbors)
1TFN

structure refinement for a 24-nucleotide rna hairpin, nmr, minimized average structure ribonucleic acid, hairpin, bacteriophage r17 mol_id: 1; molecule: r17c; chain: null; engineered: yes
gi|1942336|pdb|1TFN| [1942336]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 structure link)
IRPEA

rna (5'-d(gpgpgpapcpupgpapcpgpapupcpapcpgpcpapgpupcpupapu-3') (24-mer rna hairpin coat protein binding site for bacteriophage r17) (nmr, minimized average structure)
gi|1421020|pdb|1RHT| [1421020]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 structure link)
M14428

Bacteriophage S13 circular DNA, complete genome
gi|216089|gb|M14428|S13CG [216089]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 12 protein links, 26 nucleotide neighbors, or 1 genome link)

J05393

Bacteriophage T1 DNA N-6-adenine-methyltransferase (M.T1) gene, complete cds
gi|166163|gb|J05393|BT1NAMTA [166163]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
L46845

Bacteriophage T2 frd3, frd2 genes, complete cds
gi|951387|gb|L46845|PT2FRD32G [951387]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 17 nucleotide neighbors)
L43611

Bacteriophage T2 fibritin (wac) gene, complete cds
gi|903869|gb|L43611|PT2WAC [903869]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 4 nucleotide neighbors)
M24812

Bacteriophage T2 secondary structure RNA sequence
gi|215796|gb|M24812|PT2RNA [215796]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
M22342

Bacteriophage T2 DNA-(adenine-N6)methyltransferase (dam) gene, complete cds
gi|215792|gb|M22342|PT2DAM [215792]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
S57515

orf 61.2 {intergenic region between 41 and 61} [bacteriophage T2, Genomic, 323 nt]
gi|298524|gb|S57515|S57515 [298524]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X05312

Bacteriophage T2 gene 38 for receptor recognizing protein
gi|15197|emb|X05312|MYT2G38 [15197]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X04442

Bacteriophage T2 gene 37 for receptor recognizing protein
gi|15195|emb|X04442|MYT2G37 [15195]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X12460

Bacteriophage T2 gene 32 mRNA for single-stranded DNA binding protein
gi|15192|emb|X12460|MYT2G32 [15192]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 14 nucleotide neighbors)
X57797

Bacteriophage T2 gene for gp12
gi|14875|emb|X56555|BT2GP12 [14875]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 2 nucleotide neighbors)
X01755

Bacteriophage T2 tail fiber gene 36
gi|15189|emb|X01755|MYT2F36 [15189]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
M14784

Bacteriophage T3 strain amNG220B right end, tail fiber protein, lysis protein and DNA packaging proteins, complete cds
gi|215810|gb|M14784|PT3RE [215810]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, or 10 nucleotide neighbors)
SEG_PT3RNAPOL

Bacteriophage T3 RNA polymerase III gene, 5' end
gi|710559|gb||SEG_PT3RNAPOL [710559]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
M22610

Bacteriophage T3 RNA polymerase III gene, 3' end
gi|340722|gb|M22610|PT3RNAPOL2 [340722]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M22609

Bacteriophage T3 RNA polymerase III gene, 5' end
gi|340721|gb|M22609|PT3RNAPOL1 [340721]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
X05031

Bacteriophage T3 gene region 1-2.5 with primary origin of replication
gi|15719|emb|X05031|POT3ORI [15719]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 11 protein links, or 5 nucleotide neighbors)
X03964

Bacteriophage T3 early control region pos. 308-810 from genome left end
gi|15718|emb|X03964|POT3EP [15718]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 20 nucleotide neighbors)
X17255

Bacteriophage T3 gene 1 to gene 11
gi|15682|emb|X17255|POT3111G [15682]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 MEDLINE links, 36 protein links, 17 nucleotide neighbors, or 1 genome link)
X15840

Phage T3 gene 10
gi|15625|emb|X15840|PODT3G10 [15625]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
X02981

Bacteriophage T3 gene 1 for RNA polymerase
gi|15561|emb|X02981|PODOT3P [15561]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
J02503

bacteriophage t3 5' end, terminally redundant sequence (trs)
gi|215816|gb|J02503|PT3TRS1 [215816]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
SEG_PT3TRS

bacteriophage t3 5' end, terminally redundant sequence (trs)
gi|215818|gb||SEG_PT3TRS [215818]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
J02504

bacteriophage t3 3' end, terminally redundant sequence (trs)
gi|215817|gb|J02504|PT3TRS2 [215817]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

http://www.rs.noda.sut.ac.jp/~kunisawa
Bacteriophage T4 genomic database compiled by Arisaka et al.

X95646

Bacteriophage T5 DNA for region 60.5%-71% of the T5 genome
gi|2791557|emb|AJ001191|BTJ001191 [2791557]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 7 MEDLINE links, 12 protein links, or 6 nucleotide neighbors)
X56847

Bacteriophage T5 genomic region encoding early genes D10-D15
gi|15407|emb|X12930|MYT5D10 [15407]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 5 protein links, or 4 nucleotide neighbors)
AF039886

Bacteriophage T5 subclone T5.5.3r5.18r, single pass sequence, genomic survey sequence
gi|2811154|gb|AF039886|AF039886 [2811154]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039885

Bacteriophage T5 subclone T5.40f,41f, single pass sequence, genomic survey sequence
gi|2811153|gb|AF039885|AF039885 [2811153]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039884

Bacteriophage T5 subclone T5.26.fr, single pass sequence, genomic survey sequence
gi|2811152|gb|AF039884|AF039884 [2811152]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039883

Bacteriophage T5 subclone 10-T5.5 JF, single pass sequence, genomic survey sequence
gi|2811151|gb|AF039883|AF039883 [2811151]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039882

Bacteriophage T5 subclone 41-T5.5.4BF, single pass sequence, genomic survey sequence
gi|2811150|gb|AF039882|AF039882 [2811150]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039881

Bacteriophage T5 subclone 39-T5.5.4aF, single pass sequence, genomic survey sequence
gi|2811149|gb|AF039881|AF039881 [2811149]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
AF039880

Bacteriophage T5 subclone 19-T5.7.2r, single pass sequence, genomic survey sequence
gi|2811148|gb|AF039880|AF039880 [2811148]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039879

Bacteriophage T5 subclone 18-T5.7.2F, single pass sequence, genomic survey sequence
gi|2811147|gb|AF039879|AF039879 [2811147]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039878

Bacteriophage T5 subclone 11-T5.5.7R, single pass sequence, genomic survey sequence
gi|2811146|gb|AF039878|AF039878 [2811146]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 nucleotide neighbors)
AF039877

Bacteriophage T5 subclone T5.4FR, single pass sequence, genomic survey sequence
gi|2811145|gb|AF039877|AF039877 [2811145]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039876

Bacteriophage T5 subclone 22-T5.16R, single pass sequence, genomic survey sequence
gi|2811144|gb|AF039876|AF039876 [2811144]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039875

Bacteriophage T5 subclone 21-T5.16R, single pass sequence, genomic survey sequence
gi|2811143|gb|AF039875|AF039875 [2811143]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039874

Bacteriophage T5 subclone 21-T5.16T, single pass sequence, genomic survey sequence
gi|2811142|gb|AF039874|AF039874 [2811142]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039873

Bacteriophage T5 subclone 09-T5.6T, single pass sequence, genomic survey sequence
gi|2811141|gb|AF039873|AF039873 [2811141]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039872

Bacteriophage T5 subclone 09-T5.6R, single pass sequence, genomic survey sequence
gi|2811140|gb|AF039872|AF039872 [2811140]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 nucleotide neighbors)
AF039871

Bacteriophage T5 subclone 04-T5.26.R, single pass sequence, genomic survey sequence
gi|2811139|gb|AF039871|AF039871 [2811139]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039870

Bacteriophage T5 subclone 13-T5.42F, single pass sequence, genomic survey sequence
gi|2811138|gb|AF039870|AF039870 [2811138]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)



X69460

Bacteriophage T5 ltf gene for L-shaped tail fibers
gi|15415|emb|X69460|MYT5LTF [15415]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 1 protein link, or 4 nucleotide neighbors)
X03402

Bacteriophage T5 D15 gene for 5' exonuclease
gi|15413|emb|X03402|MYT5EXOG [15413]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
Z11972

Bacteriophage T5 tRNA-Tyr, tRNA-Glu, tRNA-Trp, tRNA-Phe, tRNA-Cys and tRNA-Asn genes, and ORFs 91aa, 90aa, 42aa and 172aa
gi|15795|emb|Z11972|T56TRNAG [15795]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 3 nucleotide neighbors)
X03898

Bacteriophage T5 genes for tRNA-His, -Ser and -Leu
gi|15786|emb|X03898|STT5RN1 [15786]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 MEDLINE links)
X04177

Bacteriophage T5 gene for transfer RNA-Gln
gi|15421|emb|X04177|MYT5TRNQ [15421]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
X03899

Bacteriophage T5 genes for tRNA-Val, -Lys, -fMet, -Pro and -Ile3
gi|15787|emb|X03899|STT5RN2 [15787]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
X03793

Bacteriophage T5 gene for tRNA-Asp (GUC)
gi|15472|emb|X03798|NCT5TRDG [15472]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
Y00364

Bacteriophage T5 tRNA gene cluster (27.8%-22.4%)
gi|15420|emb|Y00364|MYT5TRN [15420]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
X03140

Bacteriophage T5 DNA with rho-dependent transcription terminator (Hind III-P fragment)
gi|15417|emb|X03140|MYT5RHO [15417]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
Z35070

Bacteriophage T6 DNA
gi|535228|emb|Z35074|MYEREGBT6 [535228]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF060870

Coliphage T6 small subunit distal tail fiber (gene 36) gene, partial cds; and large subunit distal tail fiber (gene 37) and tail fiber adhesin (gene 38) genes, complete cds
gi|3676458|gb|AF052605|AF052605 [3676458]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 protein links, or 2 nucleotide neighbors)
Z35072

Bacteriophage T6 DNA encoding ORF19.1 gene and g19 gene
gi|535232|emb|Z35072|MYTAILT6 [535232]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
X12488

Bacteriophage T6 gene 32 mRNA for single-stranded DNA binding protein
gi|15843|emb|X12488|MYT6G32 [15843]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 14 nucleotide neighbors)
Z78095

Bacteriophage T6 DNA (1506 bp)
gi|1488562|emb|Z78095|BPHZ78095 [1488562]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 4 nucleotide neighbors)
Z35079

Bacteriophage T6 DNA for Ip5, Ip6
gi|535215|emb|Z35079|MY57BT6 [535215]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
X68725

E.coli bacteriophage T6 gene for beta-glucosyl-HMC-alpha-glucosyl-transferase
gi|296439|emb|X68725|ECT6 [296439]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
X69894

Bacteriophage T6 alt gene for ADP-Ribosyltransferase
gi|15422|emb|X69894|MYT6ADP [15422]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
L46846

Bacteriophage T6 frd3, frd2 genes, complete cds
gi|951390|gb|L46846|PT6FRD32G [951390]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 protein links)
M27738

Bacteriophage T6 translational repressor protein (regA), complete cds
gi|215993|gb|M27738|PT6REGA [215993]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 5 nucleotide neighbors)
M38465

Bacteriophage T6 DNA ligase gene, complete cds
gi|215991|gb|M38465|PT6LIG55 [215991]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
V01146

Genome of bacteriophage T7
gi|431187|emb|V01146|T7CG [431187]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 13 MEDLINE links, 60 protein links, 105 nucleotide neighbors, or 1 genome link)

X60322

Bacteriophage alpha3 genes A, B, K, C, D, E, J, F, G, H
gi|14775|emb|X60322|BACALPHA [14775]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, 22 nucleotide neighbors, or 1 genome link)

X13332

Bacteriophage alpha3 DNA for origin of replication
gi|15093|emb|X13332|MIA3ORPL [15093]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
X12611

Bacteriophage alpha3 gene for protein A part, finger domain
gi|15092|emb|X12611|MIA3AFIN [15092]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 6 nucleotide neighbors)
X15721

Bacteriophage alpha3 deletion mutation DNA for the origin region (-ori) of replication
gi|14774|emb|X15721|BA3DMOR9 [14774]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)
X15720

Bacteriophage alpha3 deletion mutant DNA for the origin region (-ori) of replication
gi|14773|emb|X15720|BA3DMOR8 [14773]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
X15719

Bacteriophage alpha3 insertion mutant DNA for the origin region (-ori) of replication
gi|14772|emb|X15719|BA3DMOR7 [14772]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 10 nucleotide neighbors)
X15718

Bacteriophage alpha3 deletion mutation DNA for origin region (-ori) of replication
gi|14771|emb|X15718|BA3DMOR6 [14771]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)
X15717

Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication
gi|14770|emb|X15717|BA3DMOR5 [14770]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 9 nucleotide neighbors)
X15716

Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication
gi|14769|emb|X15716|BA3DMOR4 [14769]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 10 nucleotide neighbors)
X15715

Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication
gi|14768|emb|X15715|BA3DMOR3 [14768]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)
X15714

Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication
gi|14767|emb|X15714|BA3DMOR2 [14767]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)
X15713

Bacteriophage alpha3 deletion mutant DNA for the origin region (-ori) of replication
gi|14766|emb|X15713|BA3DMOR1 [14766]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)
X62059

Bacteriophage alpha3 origin of cDNA synthesis (oriGA)
gi|14763|emb|X62059|AL3ORIGA [14763]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
X62058

Bacteriophage alpha3 origin of cDNA synthesis (oriAA)
gi|14762|emb|X62058|AL3ORIAA [14762]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
J02444

Bacteriophage alpha3 origin of DNA replication
gi|166103|gb|J02444|AL3ORI [166103]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 12 nucleotide neighbors)
M25640

Bacteriophage alpha-3 H protein gene, complete cds
gi|166101|gb|M25640|AL3HP [166101]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 13 nucleotide neighbors)
M10631

Bacteriophage alpha-3 cleavage site for phage phi-X174 gene A protein
gi|166099|gb|M10631|AL3CSA [166099]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
X00774

Bacteriophage alpha-3 gene J sequence
gi|15431|emb|X00774|NCBA3J [15431]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 2 nucleotide neighbors)
J02459

Bacteriophage lambda, complete genome
gi|215104|gb|J02459|LAMCG [215104]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 87 MEDLINE links, 67 protein links, 190 nucleotide neighbors, or 1 genome link)

J02482

Bacteriophage phi-X174, complete genome
gi|216019|gb|J02482|PX1CG [216019]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 23 MEDLINE links, 11 protein links, 26 nucleotide neighbors, or 1 genome link)

J02454

Bacteriophage G4, complete genome
gi|215415|gb|J02454|PG4CG [215415]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE links, 11 protein links, 20 nucleotide neighbors, or 1 genome link)

X60323

Bacteriophage phiK complete genome
gi|1478118|emb|X60323|BPHIKCG [1478118]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 10 protein links, 18 nucleotide neighbors, or 1 genome link)
L42820

Bacteriophage BF23 tail protein (hrs) gene, complete cds
gi|1048680|gb|L42820|BBFHRS [1048680]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
X54455

Bacteriophage BF23 gene 17 and gene 18
gi|14797|emb|X54455|BF231718G [14797]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 2 nucleotide neighbors)
M37097

Bacteriophage BF23 DNA, right end of terminal repetition
gi|166115|gb|M37097|BBFRIGH [166115]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M37096

Bacteriophage BF23 DNA, left end of terminal repetition
gi|166114|gb|M37096|BBFLEFT [166114]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M37095

Bacteriophage BF23 A2-A3 gene, complete cds, and A1 gene, 5' end
gi|166110|gb|M37095|BBFA2A3 [166110]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 3 protein links, or 1 nucleotide neighbor)
AF056281

Bacteriophage BF23 clone bf23.mac5/6.1, genomic survey sequence
gi|3090930|gb|AF056281|AF056281 [3090930]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056280

Bacteriophage BF23 clone bf23.mac3, genomic survey sequence
gi|3090929|gb|AF056280|AF056280 [3090929]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056279

Bacteriophage BF23 clone bf23.mac18/21.34, genomic survey sequence
gi|3090928|gb|AF056279|AF056279 [3090928]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056278

Bacteriophage BF23 clone bf23.mac16/19.33, genomic survey sequence
gi|3090927|gb|AF056278|AF056278 [3090927]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056277

Bacteriophage BF23 clone bf23.mac16/19-33, genomic survey sequence
gi|3090926|gb|AF056277|AF056277 [3090926]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056276

Bacteriophage BF23 clone bf23.mac12/9-9, genomic survey sequence
gi|3090925|gb|AF056276|AF056276 [3090925]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056275

Bacteriophage BF23 clone bf23.mac11/14-24, genomic survey sequence
gi|3090924|gb|AF056275|AF056275 [3090924]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056274

Bacteriophage BF23 clone bf23.57r64r, genomic survey sequence
gi|3090923|gb|AF056274|AF056274 [3090923]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 3 nucleotide neighbors)
AF056273

Bacteriophage BF23 clone bf23.54fr, genomic survey sequence
gi|3090922|gb|AF056273|AF056273 [3090922]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056272

Bacteriophage BF23 clone bf23.47n\mact0/7, genomic survey sequence
gi|3090921|gb|AF056272|AF056272 [3090921]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056271

Bacteriophage BF23 clone bf23.23.66r, genomic survey sequence
gi|3090920|gb|AF056271|AF056271 [3090920]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056270

Bacteriophage BF23 clone bf23.23.64f, genomic survey sequence
gi|3090919|gb|AF056270|AF056270 [3090919]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
Bacteriophage BF23 clone bf23.23.60r, genomic survey sequence
gi|3090918|gb|AF056269|AF056269 [3090918]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056268

Bacteriophage BF23 clone bf23.23.60f, genomic survey sequence
gi|3090917|gb|AF056268|AF056268 [3090917]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
AF056267

Bacteriophage BF23 clone bf23.23.59r, genomic survey sequence
gi|3090916|gb|AF056267|AF056267 [3090916]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056266

Bacteriophage BF23 clone bf23.23.59f, genomic survey sequence
gi|3090915|gb|AF056266|AF056266 [3090915]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056265

Bacteriophage BF23 clone bf23.23.56r, genomic survey sequence
gi|3090914|gb|AF056265|AF056265 [3090914]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056264

Bacteriophage BF23 clone bf23.23.56f, genomic survey sequence
gi|3090913|gb|AF056264|AF056264 [3090913]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056263

Bacteriophage BF23 clone bf23.23.68f35r, genomic survey sequence
gi|3090912|gb|AF056263|AF056263 [3090912]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056262

Bacteriophage BF23 clone bf23.23.43fr.66f, genomic survey sequence
gi|3090911|gb|AF056262|AF056262 [3090911]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056261

Bacteriophage BF23 clone bf23.23.2fr, genomic survey sequence
gi|3090910|gb|AF056261|AF056261 [3090910]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056260

Bacteriophage BF23 clone bf23.23.55j; genomic survey sequence
gi|3090909|gb|AF056260|AF056260 [3090909]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056259

Bacteriophage BF23 clone bf23.23.53.r, genomic survey sequence
gi|3090908|gb|AF056259|AF056259 [3090908]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056258

Bacteriophage BF23 clone bf23.23.53.f, genomic survey sequence
gi|3090907|gb|AF056258|AF056258 [3090907]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056257

Bacteriophage BF23 clone bf23.23.52.r, genomic survey sequence
gi|3090906|gb|AF056257|AF056257 [3090906]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056256

Bacteriophage BF23 clone bf23.23.52.f, genomic survey sequence
gi|3090905|gb|AF056256|AF056256 [3090905]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056255

Bacteriophage BF23 clone bf23.23.49.r, genomic survey sequence
gi|3090904|gb|AF056255|AF056255 [3090904]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056254

Bacteriophage BF23 clone bf23.23.49.f, genomic survey sequence
gi|3090903|gb|AF056254|AF056254 [3090903]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056253

Bacteriophage BF23 clone bf23.23.48.r, genomic survey sequence
gi|3090902|gb|AF056253|AF056253 [3090902]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056252

Bacteriophage BF23 clone bf23.23.48.f, genomic survey sequence
gi|3090901|gb|AF056252|AF056252 [3090901]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056251

Bacteriophage BF23 clone bf23.23.44.r, genomic survey sequence
gi|3090900|gb|AF056251|AF056251 [3090900]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056250

Bacteriophage BF23 clone bf23.23.41.f, genomic survey sequence
gi|3090899|gb|AF056250|AF056250 [3090899]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056249

Bacteriophage BF23 clone bf23.23.22.a.r, genomic survey sequence
gi|3090898|gb|AF056249|AF056249 [3090898]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056248

Bacteriophage BF23 clone bf23.23.22.a.f, genomic survey sequence
gi|3090897|gb|AF056248|AF056248 [3090897]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
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AF056247 

Bacteriophage BF23 clone bf23.23.68.r, genomic survey sequence 

gi|3090896|gb|AP056247|AF056247 [3090896] 

(View GenBank report.FASTA report t ASN.l report, or Graphical view) 

250U4 

Bacteriophage BF23 DNA for putative tail protein gene 
gi|2464952jemb|Z50l 14|BF23LATE (2464952] 

(View GenBank report.FASTA report,ASN. 1 repon,Graphical view, oM protein link ) 
Dt2824 

Bacteriophage BF23 genes for minor tail protein gp24 and major tail protein gp25, complete cds 
gi|520578|dbj|DI2824|BBF2TAlL[520578] 

(View GenBank report.FASTA report.ASN.1 report.Graphical view,l MEDLINE link, 2 protein links, or 3 nucleotide neighbors ) 
Z34953 

Bacteriophage K3 ip9, ip7 and ip8 genes 
gi|535261|emb|Z34953|MYK3IP978 [535261] 

(View GenBank report,FASTA report, ASN. I report,Graphical viewj MEDLINE link, 3 protein links, or 1 nucleotide neighbor ) 
235075 

Bacteriophage K3 DNA for Ip3 and Ip4 
gi|535229iemb|Z35075|MYEORF64K [535229} 

(View GenBank report,FASTA repoitASN.l rcport,Grapbical view,l MEDLINE link, or 2 protein links ) 
X05560 

Bacteriophage K3 gene 38 for receptor recognizing protein 
gi|15112|emb|X05560|MYK3G38 [15112]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 protein link )
X04747 

Bacteriophage K3 gene 37 for receptor recognizing protein 
gi|15110|emb|X04747|MYK3G37 [15110]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors )
X01754 

Bacteriophage K3 tail fiber gene 36 
gi|15108|emb|X01754|MYK3F36 [15108]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 2 protein links )
M16812 

Bacteriophage K3 't' lysis gene, complete cds
gi|215503|gb|M16812|PK3LYST [215503]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors )
L46833 

Bacteriophage K3 frd3, frd2 genes, complete cds 
gi|951377|gb|L46833|PK3FRD32G [951377]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 protein links, or 2 nucleotide neighbors )
L43613 

Bacteriophage K3 fibritin (wac) gene, complete cds 
gi|903861|gb|L43613|PK3WAC [903861]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 protein link, or 4 nucleotide neighbors )
X01753

Bacteriophage Ox2 tail fiber gene 36

gi|15122|emb|X01753|MYOX2F36 [15122]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor )
L43612 

Bacteriophage Ox2 fibritin (wac) gene, complete cds
gi|903848|gb|L43612|OX2WAC [903848]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 protein link, or 4 nucleotide neighbors )

Z46880
Bacteriophage Ox2 stp gene
gi|599663|emb|Z46880|BPOX2STP [599663]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors )
X05675 

Bacteriophage Ox2 gene 38 for receptor-recognizing protein and flanking regions
gi|15124|emb|X05675|MYOX2G38 [15124]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor )
M33533 

Bacteriophage RB18 translational repressor protein (regA) and Orf43.1, complete cds
gi|216083|gb|M33533|RB18REGA [216083]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors )
AF033329 

Bacteriophage RB18 single-stranded binding protein (gene 32) gene, partial cds, and 5' region
gi|2645788|gb|AF033329|AF033329 [2645788]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 protein link, or 11 nucleotide neighbors )
M86231 

Bacteriophage RB69 gene 62, 3' end; RegA (regA) gene, complete cds
gi|215354|gb|M86231|P6962REGA [215354]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor )
AF033332 

Bacteriophage RB69 single-stranded binding protein (gene 32) gene, partial cds, and 5' region 
gi|2645794|gb|AF033332|AF033332 [2645794]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 protein link, or 12 nucleotide neighbors )
U34036 

Bacteriophage RB69 DNA polymerase (43) gene, complete cds 
gi|1237125|gb|U34036|BRU34036 [1237125]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 protein link )
V01145

Bacteriophage H1 genome fragment. Each Thymine given in this sequence represents a HMU-residue
(HMU = 5-hydroxymethyluracil)
gi|15557|emb|V01145|PODOH1 [15557]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 MEDLINE link )
X05676 

Bacteriophage M1 gene 38 for receptor recognizing protein and flanking regions
gi|15114|emb|X05676|MYM1G38 [15114]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor )
AF034575

Bacteriophage M1 putative integrase (int) gene, complete cds, and attP region, complete sequence
gi|2662472|gb|AF034575|AF034575 [2662472]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 protein link )
AF033321 

Bacteriophage M1 single-stranded binding protein (gene 32) gene, partial cds, and 5' region
gi|2645772|gb|AF033321|AF033321 [2645772]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 protein link, or 17 nucleotide neighbors )
X55190 

Bacteriophage TuIa 37 and 38 genes for receptor-recognizing proteins 37 and 38 (respectively), partial cds
gi|14860|emb|X55190|BPTUIA [14860]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors )
AF033334 

Bacteriophage TuIb single-stranded binding protein (gene 32) gene, partial cds, and 5' region
gi|2645798|gb|AF033334|AF033334 [2645798]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 5 nucleotide neighbors )
X55191 

Bacteriophage TuIb 37 gene for receptor-recognizing protein 37 (partial cds), 38 gene for receptor-recognizing protein 38,
and t gene (partial cds)
gi|14863|emb|X55191|BPTUIB [14863]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 3 protein links, or 3 nucleotide neighbors )
X13065 

Bacteriophage phi80 early region
gi|14800|emb|X13065|BP80ER [14800]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 8 protein links, or 6 nucleotide neighbors )

D00360
Bacteriophage phi80 cor gene
gi|217782|dbj|D00360|P8080COR [217782]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 protein link )
X01639 

Bacteriophage phi 80 DNA-fragment with replication origin
gi|15828|emb|X01639|XXPHI80 [15828]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 25 nucleotide neighbors )
X04051 

Lambdoid bacteriophage phi 80 int-xis region (integrase-excisionase region) 
gi|15770|emb|X04051|STPHI80X [15770]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor )
X06751 

Phage Phi80 DNA for major coat protein
gi|15768|emb|X06751|STPHI80C [15768]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 11 nucleotide neighbors )
X75949 

Bacteriophage phi80 DNA for ORF xI7l a and ORF xl71.28* 
gi|458811|emb|X75949|ECORF171B [458811]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 28 nucleotide neighbors )
L40418

Bacteriophage phi-80 gene, complete cds
gi|1019107|gb|L40418|P80A [1019107]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 protein link )
M24831

Bacteriophage phi-80 Tyr-tRNA gene, 3' end
gi|215363|gb|M24831|P80TGY [215363]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 43 nucleotide neighbors )
M10670

Bacteriophage phi-80 replication origin
gi|215361|gb|M10670|P80ORI [215361]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor )
M24825 

Bacteriophage phi-80 RNA fragment 
gi|215360|gb|M24825|P80M3A [215360]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 nucleotide neighbor )
M11919

Bacteriophage phi-80 cI immunity region encoding the N gene
gi|215358|gb|M11919|P80CI [215358]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors )
M10891 

Bacteriophage phi-80 attP site DNA 
gi|215357|gb|M10891|P80ATTP [215357]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 nucleotide neighbor )
M19473 

Bacteriophage 933J (from E.coli) proviral Shiga-like toxin type 1 subunits A and B genes, complete cds
gi|215072|gb|M19473|J93SLTI [215072]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE links, 2 protein links, or 20 nucleotide neighbors )
Y10775 

Bacteriophage 933W ileX, stx2A and stx2B genes 
gi|1938206|emb|Y10775|BP933ILEX [1938206]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 protein links, or 36 nucleotide neighbors )
X83722 

Bacteriophage 933W slt-IIB gene

gi|1490229|emb|X83722|B933WSLT [1490229]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 protein links, or 20 nucleotide neighbors )
X07865 

Bacteriophage 933W slt-II gene for Shiga-like toxin type II subunit A and B
gi|14892|emb|X07865|BWSLTII [14892]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 protein links, or 29 nucleotide neighbors )
M16625

Bacteriophage H19B (from E.coli) sltIA and sltIB genes encoding Shiga-like toxin I subunits A and B, complete cds
gi|215043|gb|M16625|H19BSLT [215043]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 24 nucleotide neighbors )
M17358

Bacteriophage H19B shiga-like toxin-1 (SLT-1) A and B subunit DNA, complete cds
gi|215046|gb|M17358|H19BSLTA [215046]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 20 nucleotide neighbors )
U29728 

Bacteriophage N4 single-stranded DNA-binding protein (N4SSB) gene, complete cds 
gi|939708|gb|U29728|BNU29728 [939708]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE links, or 1 protein link )
J02580 

Bacteriophage PA-2 (E.coli porcine strain isolate) Rz gene, 5' end; ORF2, outer membrane porin protein (lc) and ORF1 genes,
complete cds

gi|215366|gb|J02580|PA2LC [215366]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 4 protein links, or 4 nucleotide neighbors )
U32222 

Bacteriophage 186, complete sequence 
gi|3337249|gb|U32222|BlU32222 [3337249]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,6 MEDLINE links, 46 protein links, or 5 nucleotide neighbors )
X51522 

Bacteriophage P4 complete DNA genome 
gi|450916|emb|X51522|MYP4CG [450916] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,3 MEDLINE links, 13 protein links, 6 nucleotide neighbors,
or 1 genome link )

X92588 

Bacteriophage 82 orf33, orf151, orf36, orf96, rus, orf45, and Q genes
gi|1051111|emb|X92588|BAC82HOLL [1051111]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,7 protein links, or 1 nucleotide neighbor )
J02803 

Bacteriophage 82 antitermination protein (Q) gene, complete cds
gi|215364|gb|J02803|P82Q [215364]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 protein link )
U02466 

Bacteriophage HK022 (cro), (cII) and (O) genes, complete cds, (P) gene, partial cds
gi|407285|gb|U02466|BHU02466 [407285]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 5 protein links, or 1 nucleotide neighbor )
M26291 

Bacteriophage D108 regulatory DNA-binding protein (ner) gene, complete cds 
gi|166194|gb|M26291|D18NER [166194]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor )
M11272

Bacteriophage D108 left-end DNA
gi|166193|gb|M11272|D18LEDNA [166193]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 2 nucleotide neighbors )
M18902

Bacteriophage D108 kil gene encoding a replication protein, 3' end; and containing three ORFs, complete cds
gi|166191|gb|M18902|D18KIL [166191]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors )
M10191

Bacteriophage D108, left end with Mu A protein binding sites L1 and L2
gi|166190|gb|M10191|D18BSL [166190]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 5 nucleotide neighbors )
J02447 

bacteriophage d108 gene a 5' end

gi|166189|gb|J02447|D18AAA [166189]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 MEDLINE link )
V00865 

Bacteriophage D108 fragment from genes A and ner (C-terminus of ner and N-terminus of A)
gi|15437|emb|V00865|NCD108 [15437]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 2 protein links )
X01914 

Bacteriophage IKe gene for DNA binding protein
gi|14957|emb|X01914|INIKEDBP [14957]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors )

AF064539 
Bacteriophage N15, complete genome
gi|3192683|gb|AF064539|AF064539 [3192683]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE links, 60 protein links, 26 nucleotide neighbors,
or 1 genome link ) 

U02303 

Bacteriophage If1, complete genome
gi|3676280|gb|U02303|B2U02303 [3676280]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,10 protein links, or 1 genome link )
AF007792 

Bacteriophage Mu late morphogenetic region 
gi|3551775|gb|AF007792|AF007792 [3551775]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 nucleotide neighbor )
U24159 

Bacteriophage HP1 strain HP1c1, complete genome
gi|1046235|gb|U24159|BHU24159 [1046235]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,6 MEDLINE links, 41 protein links, 8 nucleotide neighbors,
or 1 genome link ) 

Z71579

Bacteriophage S2 type A 5.6 kb DNA fragment
gi|1679806|emb|Z71579|BPHSlADNA [1679806]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,3 MEDLINE links, 9 protein links, or 9 nucleotide neighbors )
X53238 

Klebsiella sp. bacteriophage K11 gene 1 for RNA polymerase
gi|14984|emb|X53238|KSK11RPO [14984]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor )
X85010

Bacteriophage A511 ply511 gene
gi|853748|emb|X85010|BPA511PLY [853748]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor )
U29728 

Bacteriophage N4 single-stranded DNA-binding protein (N4SSB) gene, complete cds
gi|939708|gb|U29728|BNU29728 [939708]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE links, or 1 protein link )
J02445 

bacteriophage bo1 3'-terminal region rna
gi|166152|gb|J02445|BO1TR3 [166152]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 5 nucleotide neighbors )
L06183 

Bacteriophage L5 (from Leuconostoc oenos) genome 
gi|289353|gb|L06183|BL5GENM [289353]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 genome link )
AF074945 

Mycoplasma arthritidis bacteriophage MAV1, complete genome
gi|3511243|gb|AF074945|AF074945 [3511243]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,15 protein links, 3 nucleotide neighbors, or 1 genome link )
L13696 

Bacteriophage L2 (from Mycoplasma), complete genome 
gi|289338|gb|L13696|BUCG [289338]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,3 MEDLINE links, 14 protein links, or 1 genome link )
X80191 

Bacteriophage PP7 mRNA for maturation, coat, lysis and replicase proteins
gi|517237|emb|X80191|BPP7PR [517237]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 4 protein links, or 1 genome link )
M19377

Bacteriophage Pf3 from Pseudomonas aeruginosa (New York strain), complete genome
gi|215380|gb|M19377|PF3COMNY [215380]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 9 protein links, or 5 nucleotide neighbors )
M11912 

Bacteriophage Pf3 from Pseudomonas aeruginosa (Nijmegen strain), complete genome
gi|215371|gb|M11912|PF3COMN [215371]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 9 protein links, 5 nucleotide neighbors, or 1
genome link )

V00605 

Bacteriophage Pf1 gene encoding DNA binding protein
gi|14970|emb|V00605|INOPF1 [14970]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 protein link, or 1 nucleotide neighbor )
L05626 

Bacteriophage PR4 capsid protein (P6) gene, complete cds 
gi|215735|gb|L05626|PR4P6MAJA [215735]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor )
D13409

Bacteriophage phiCTX (isolated from Pseudomonas aeruginosa) cosR, attP, int genes
gi|217776|dbj|D13409|BPHCOSR [217776]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 3 protein links, or 3 nucleotide neighbors )
D13408

Bacteriophage phiCTX (isolated from Pseudomonas aeruginosa) cosL, ctx genes
gi|217775|dbj|D13408|BPHCOSLCTX [217775]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE links, or 3 nucleotide neighbors )
M24832 

Bacteriophage f2 coat protein gene, partial cds 
gi|166228|gb|M24832|F2CRNACA [166228]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors )
S72011

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618967|gb|AF017629|AF017629 [2618967]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
AF017628

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618964|gb|AF017628|AF017628 [2618964]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
AF017627

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618961|gb|AF017627|AF017627 [2618961]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
AF017626 

Bacteriophage 21 isocitrate dehydrogenase (icd) gene, partial cds; and integrase (int) gene, partial cds 
gi|2618958|gb|AF017626|AF017626 [2618958]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 49 nucleotide neighbors )
AF017625 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds 
gi|2618955|gb|AF017625|AF017625 [2618955]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
AF017624 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618952|gb|AF017624|AF017624 [2618952]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
AF017623

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618949|gb|AF017623|AF017623 [2618949]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
AF017622

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618946|gb|AF017622|AF017622 [2618946]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
AF017621

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618943|gb|AF017621|AF017621 [2618943]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
D26449 

Bacteriophage PS17 FI gene for tail sheath protein (gpFI) and FII gene for tail tube protein (gpFII), complete cds
gi|452162|dbj|D26449|BPSFTFII [452162]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 2 protein links )
X87627 

Bacteriophage D3112 A and B genes
gi|974768|emb|X87627|BPD3112AB [974768]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor )
U32623 

Bacteriophage D3 transcriptional activator CII (cII) gene, complete cds
gi|984852|gb|U32623|BDU32623 [984852]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 protein link, or 1 nucleotide neighbor )
L34781

Bacteriophage phi 11 holin homologue (ORF3) gene, complete cds and peptidoglycan hydrolase (lytA) gene, partial cds
gi|511838|gb|L34781|BPHHOLIN [511838]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 4 protein links, or 2 nucleotide neighbors )
L14810

Bacteriophage P22 (gp10) gene, complete cds, and (gp26) gene, complete cds
gi|294053|gb|L14810|P22GP1026X [294053]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors )
X87420 

Bacteriophage ES18 genes 24, c2, cro, c1, 18, and oL and oR operators
gi|1143407|emb|X87420|BPES18GEN [1143407]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,5 protein links, or 9 nucleotide neighbors )
L42820 

Bacteriophage BF23 tail protein (hrs) gene, complete cds 
gi|1048680|gb|L42820|BBFHRS [1048680] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor )
X14980 

Bacteriophage PRD1 XV gene for protein P15 (lytic enzyme)
gi|15802|emb|X14980|TEPRD1XV [15802]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors )
X06321 

Bacteriophage PRD1 gene 8 for DNA terminal protein 
gi|15800|emb|X06321|TEPRD18 [15800]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 10 nucleotide neighbors )
X14336 

Filamentous Bacteriophage I2-2 genome
gi|14920|emb|X14336|INBI22 [14920]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 9 protein links, 1 nucleotide neighbor, or 1
genome link )
L05001

Bacteriophage X glucosyl transferase gene, complete cds
gi|216044|gb|L05001|PXFGLUSYLT [216044]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 protein link )
M29479 

Bacteriophage p4 sid and psu genes partial cds, and delta gene, complete cds
gi|215701|gb|M29479|PP4SDP [215701]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,3 protein links, or 4 nucleotide neighbors )

SEG_PP4PSUSID 
Bacteriophage P4 capsid size determination protein (sid) gene, 5' end
gi|215698|gb||SEG_PP4PSUSID [215698]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor )
M29650 

Bacteriophage P4 polarity suppression protein (psu) gene, complete cds 

gi|215697|gb|M29650|PP4PSUSID2 [215697] 

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)

M29651 

Bacteriophage P4 capsid size determination protein (sid) gene, 5' end 

gi|215696|gb|M29651|PP4PSUSID1 [215696]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)

M27748 

Bacteriophage P4 gop, beta, and cII genes, complete cds and int gene, 3' end
gi|215691|gb|M27748|PP4GOPBC [215691]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 4 protein links, or 1 nucleotide neighbor )
K02750

Bacteriophage IKe, complete genome
gi|215061|gb|K02750|IKECG [215061]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 10 protein links, 4 nucleotide neighbors, or 1
genome link )

L40418 

Bacteriophage phi-80 gene, complete cds 
gi|1019107|gb|L40418|P80A [1019107]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 protein link )
AF032122 

Bacteriophage SfII integrase (int) gene, partial cds; and bactoprenol glucosyl transferase (bgt), and glucosyl transferase II (gtrII)
genes, complete cds

gi|2465412|gb|AF021347|AF021347 [2465412]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 4 protein links, or 2 nucleotide neighbors )
M35825 

Bacteriophage SF6 fragment D lysozyme gene, complete cds 
gi|216105|gb|M35825|SF6LY2 [216105]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 protein link )

Z35479 
Bacteriophage C16 ip1 gene
gi|534936|emb|Z35479|BC16IP1 [534936]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors )
X12638

Bacteriophage 21 DNA for gene 2
gi|296141|emb|X12638|B21GENE2 [296141]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor )
X02501 

Bacteriophage 21 DNA for left end sequence with genes 1 and 2
gi|15825|emb|X02501|XXPHA21 [15825]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors )
M65239 

Bacteriophage 21 lysis genes S, R, and Rz, complete cds
gi|215466|gb|M65239|PH2LYSGEN [215466]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor )
M58702 

Bacteriophage 21 late gene regulatory region
gi|215465|gb|M58702|PH2LATEGE [215465]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 MEDLINE link )
M81255 

Bacteriophage 21 head gene operon
gi|215454|gb|M81255|PH2HEADTL [215454]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE links, 10 protein links, or 4 nucleotide neighbors )
M23775 

Bacteriophage 21 glycoprotein I gene, complete cds, and glycoprotein gene, 5' end
gi|215451|gb|M23775|PH2GPA [215451]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors )
M61865 

Bacteriophage 21 excisionase (xis), integrase (int) and isocitrate dehydrogenase (icd), complete cds 
gi|215448|gb|M61865|PH22XISAA [215448] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 protein links, or 9 nucleotide neighbors )
S72011

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618967|gb|AF017629|AF017629 [2618967]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
AF017628 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618964|gb|AF017628|AF017628 [2618964]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
AF017627 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds 
gi|2618961|gb|AF017627|AF017627 [2618961]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
AF017626 

Bacteriophage 21 isocitrate dehydrogenase (icd) gene, partial cds; and integrase (int) gene, partial cds 
gi|2618958|gb|AF017626|AF017626 [2618958]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 49 nucleotide neighbors )
AF017625

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618955|gb|AF017625|AF017625 [2618955]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
AF017624

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618952|gb|AF017624|AF017624 [2618952]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
AF017623

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618949|gb|AF017623|AF017623 [2618949]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
AF017622

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618946|gb|AF017622|AF017622 [2618946]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
AF017621

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618943|gb|AF017621|AF017621 [2618943]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
M57455 

Bacteriophage 42D (clone pDB17) (from Staphylococcus aureus) staphylokinase gene, complete cds 
gi|215344|gb|M57455|P42STK [215344]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 protein link, or 9 nucleotide neighbors )
Y12633 

Bacteriophage 85 DNA, promoter sequence of unknown gene 

gi|2058285|emb|Y12633|B85PROM [2058285]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)

X98146 

Bacteriophage P1 DNA sequence around the Op88 operator
gi|1359513|emb|X98146|BP1OP88OP [1359513]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 nucleotide neighbor )
Y07739 

Staphylococcus phage Twort holTW, plyTW genes 
gi|2764979|emb|Y07739|BPTWGHOLG [2764979] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 2 protein links )
L07580 

Bacteriophage phi-11 rinA and rinB genes, required for the activation of Staphylococcal phage phi-11 int expression
gi|166160|gb|L07580|BPHIUNAB [166160]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 2 protein links )
M34832 

Bacteriophage phi-11 integrase (int) and excisionase (xis) genes, complete cds
gi|166157|gb|M34832|BPHINTXIS [166157]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors )
M20394

Bacteriophage phi-11 S. aureus attachment site (attP)
gi|166156|gb|M20394|BPHATTP [166156]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 4 nucleotide neighbors )
X82312

Bacteriophage phi-13 integrase gene
gi|758228|emb|X82312|PHI13INT [758228]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 protein link, or 3 nucleotide neighbors )
X61719

S. aureus phi-13 lysogen right chromosome/bacteriophage DNA junction
gi|46625|emb|X61719|SAP13RJNC [46625]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 MEDLINE link )
X61718 

S.aureus phi-13 lysogen left chromosomal/bacteriophage DNA junction
gi|46624|emb|X61718|SAP13IJNC [46624]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 MEDLINE link )
X61717

Bacteriophage phi-13 core sequence for attachment
gi|14799|emb|X61717|BP13ATTP [14799]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE links, or 3 nucleotide neighbors )
U01875

Bacteriophage phi-13 putative regulatory region and integrase (int) gene, partial cds
gi|437118|gb|U01875|U01875 [437118]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,3 MEDLINE links, or 4 nucleotide neighbors )



X67739 

S.aureus Bacteriophage phi-42 attP gene 
gi|14809|emb|X67739|BPATTPA [14809]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 3 nucleotide neighbors )
U01872

Bacteriophage phi-42 integrase (int) gene, complete cds
gi|437115|gb|U01872|U01872 [437115]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,3 MEDLINE links, 2 protein links, or 3 nucleotide neighbors )
X94423 

Staphylococcus aureus bacteriophage phi-42 DNA with ORFs (restriction modification system) 
gi|1771597|emb|X94423|SARMS [1771597]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 protein links, or 1 nucleotide neighbor )
M27965 

Bacteriophage L54a (from S.aureus) int and xis genes, complete cds 
gi|215096|gb|M27965|L54INTXIS [215096]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors )
U72397 

Bacteriophage 80 alpha holin and amidase genes, complete cds
gi|1763241|gb|U72397|B8U72397 [1763241]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 protein links, or 2 nucleotide neighbors )
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AB009866 

Bacteriophage phi PVL proviral DNA, complete sequence
gi|3341907|dbj|AB009866|AB009866 [3341907]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,63 protein links, or 1 nucleotide neighbor )
Z47794 

Bacteriophage Cp-1 DNA, complete genome 
gi|2288892|emb|Z47794|BPCP1XX [2288892]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,3 MEDLINE links, 28 protein links, 1 nucleotide neighbor, or
1 genome link )

SEG_CP7RSIT 

Bacteriophage Cp-7 (S.pneumoniae) 5' inverted terminal repeat 
gi|166186|gb||SEG_CP7RSIT [166186]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 MEDLINE link )
M11635

Bacteriophage Cp-7 (S.pneumoniae) DNA, 3' inverted terminal repeat 
gi|166185|gb|M11635|CP7RSIT2 [166185]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
M11636

Bacteriophage Cp-7 (S.pneumoniae) 5' inverted terminal repeat 
gi|166184|gb|M11636|CP7RSIT1 [166184]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)

SEG_CP5RSIT 
Bacteriophage Cp-5 (S.pneumoniae), 5' inverted terminal repeat 
gi|166181|gb||SEG_CP5RSIT [166181]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 MEDLINE link )
M11633

Bacteriophage Cp-5 (S.pneumoniae) 3' inverted terminal repeat 
gi|166180|gb|M11633|CP5RSIT2 [166180]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
M11634

Bacteriophage Cp-5 (S.pneumoniae), 5' inverted terminal repeat
gi|166179|gb|M11634|CP5RSIT1 [166179]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
M34780 

Bacteriophage Cp-9 muramidase (cpl9) gene 
gi|166187|gb|M34780|CP9CPL [166187] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor )
M34652 

Bacteriophage HB-3 amidase (hbl) gene, complete cds 
gi|215055|gb|M34652|HB3HBLA [215055]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 protein link )
U64984 

Streptococcus pyogenes phage T12 repressor, excisionase (xis), integrase (int) and erythrogenic toxin A precursor (speA) genes,
complete cds
gi|1877426|gb|U40453|SPU40453 [1877426]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE links, 4 protein links, or 22 nucleotide neighbors )
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X12375 

Phage CP-T1 (Vibrio cholerae) DNA for packaging signal (pac site)
gi|15435|emb|X12375|NCCPPAC [15435]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 protein link )
AF087814 

Vibrio cholerae filamentous bacteriophage fs-2 DNA, complete genome sequence 
gi|3702207|dbj|AB002632|AB002632 [3702207]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 9 protein links, or 1 genome link )
D83518 

Bacteriophage KVP40 gene for major capsid protein precursor, complete cds 
gi|3046858|dbj|D83518|D83518 [3046858]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 protein link )
AF033322 

Bacteriophage PST single-stranded binding protein (gene 32) gene, partial cds, and 5' region 
gi|2645774|gb|AF033322|AF033322 [2645774]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 protein link, or 17 nucleotide neighbors )
X94331

Bacteriophage L cro, 24, c2, and cl genes 

gi|1469213|emb|X94331|BLCR024C [1469213]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 4 protein links )
U82619 

Shigella flexneri bacteriophage V glucosyl transferase (gtr), integrase (int) and excisionase (xis) genes, complete cds
gi|2465470|gb|U82619|SFU82619 [2465470]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 8 protein links, or 1 nucleotide neighbor )



WO 00/32825 



PCT/IB99/02040 



Table 12



NCBI Entrez Nucleotide QUERY 
Key words: bacteriophage and lysis 
56 citations found (all selected) 



AJ011581 

Bacteriophage PS119 lysis genes 13, 19, 15, and packaging gene 3,
complete cds

gi|3676084|emb|AJ011581|BPS011581 [3676084]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,4 protein
links, or 1 nucleotide neighbor )

AJ011580 

Bacteriophage PS34 lysis genes 13, 19, 15, antiterminator gene 23, and
packaging gene 3, complete cds
gi|3676078|emb|AJ011580|BPS011580 [3676078]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,5 protein
links, or 2 nucleotide neighbors )



AJ011579



(View GenBank report,FASTA report,ASN.1 report,Graphical
links, or 1 nucleotide neighbor ) 



AF034975

Bacteriophage H-19B essential recombination function protein (erf), kil
protein (kil), regulatory protein cIII (cIII), protein gp17 (17), N
protein (N), cI protein (cI), cro protein (cro), cII protein (cII), O
protein (O), P protein (P), ren protein (ren), Roi (roi), Q protein (Q),
Shiga-like toxin A (slt-IA) and B (slt-IB) subunits, and putative holin
protein (S) genes, complete cds
gi|2668751|gb|AF034975| [2668751]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 20 protein links, or 30 nucleotide neighbors )



U37314

Bacteriophage lambda Rz1 protein precursor (Rz1) gene, complete cds
gi|1017780|gb|U37314|BLU37314 [1017780]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE
links, 1 protein link, or 9 nucleotide neighbors )



U00005

E. coli hflA locus encoding the hflX, hflK and hflC genes, hfq gene,
complete cds; miaA gene, partial cds
gi|436153|gb|U00005|ECOHFLA [436153]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,4 MEDLINE



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links, 5 protein links, or 8 nucleotide neighbors^ 



U32222 



Bacteriophage 186, complete sequence 
gi|3337249|gb|U32222|B1U32222 [3337249]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,6 MEDLINE
links, 46 protein links, or 5 nucleotide neighbors ) 



AF064539

Bacteriophage N15, complete genome
gi|3192683|gb|AF064539|AF064539 [3192683]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE
links, 60 protein links, 26 nucleotide neighbors, or 1 genome link ) 



AF063097 

Bacteriophage P2, complete genome 



Z97974

Bacteriophage phiadh lys, hol, intG, rad, and tec genes
gi|2707950|emb|Z97974|BPHIADH [2707950]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE
links, 9 protein links, or 1 nucleotide neighbor )



AF059243

Bacteriophage NL95, complete genome
gi|3088545|gb|AF059243|AF059243 [3088545]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE
links, 4 protein links, 3 nucleotide neighbors, or 1 genome link )



AF052431

Bacteriophage M11 A-protein, coat protein, A1-protein, and replicase
genes, complete cds
gi|2981208|gb|AF052431| [2981208]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE
links, 4 protein links, or 8 nucleotide neighbors )



Y07739

Staphylococcus phage Twort holTW, plyTW genes
gi|2764979|emb|Y07739|BFTWGHOLG [2764979]
(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 2
protein links )



X94331



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Bacteriophage L cro, 24. c2, and cl genes 

gi|1469213|emb|X94331|BLCR024C [1469213]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, or 4 protein links ) 



X78410 

Bacteriophage phiadh holin and lysin genes
gi|793848|emb|X78410|LGHOLLYS [793848]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor ) 



X99260 

Bacteriophage B103 genomic sequence 
gi|1429229|emb|X99260|BB103G [1429229]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 17 protein links, or 12 nucleotide neighbors ) 



AJ000741 

Bacteriophage P1 darA operon
gi|2462938|emb|AJ000741|BPAJ7641 [2462938]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 10 protein links, or 31 nucleotide neighbors )



X87420 

Bacteriophage ES18 genes 24, c2, cro, cl, 18, and oL and oR operators
gi|1143407|emb|X87420|BPES18GEN [1143407]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,5 protein
links, or 9 nucleotide neighbors ) 



L35561 

Bacteriophage phi-105 ORFs 1-3
gi|532218|gb|L35561|PH50RFHTR [532218]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, or 3 protein links )



D10027 

Group II RNA coliphage GA genome
gi|217784|dbj|D10027|PGAXX [217784]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 3 protein links, 5 nucleotide neighbors, or 1 genome link ) 



V01128 

Bacteriophage phi-X174 (cs70 mutation) complete genome
gi|15535|emb|V01128|PHIX174 [15535]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,4 MEDLINE
links, 11 protein links, or 26 nucleotide neighbors )



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S81763 

coat gene...replicase gene (bacteriophage KU1, host=Escherichia coli,
group II RNA phage, Genomic RNA, 3 genes, 120 nt)
gi|1438766|gb|S81763|S81763 [1438766]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1
MEDLINE link) 



U38906 

Bacteriophage r1t integrase, repressor protein (rro), dUTPase, holin and
lysin genes, complete cds
gi|1353517|gb|U38906|BRU38906 [1353517]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE
links, 50 protein links, or 3 nucleotide neighbors ) 



X91149

Bacteriophage phi-C31 DNA cos region
gi|1107473|emb|X91149|APHIC31C [1107473]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 6 protein links, or 1 nucleotide neighbor) 



V00642 

phage MS2 genome 
gi|15081|emb|V00642|LEMS2X [15081]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,8 MEDLINE
links, 4 protein links, or 20 nucleotide neighbors ) 



V01146

Genome of bacteriophage T7
gi|431187|emb|V01146|T7CG [431187]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,13 MEDLINE
links, 60 protein links, 105 nucleotide neighbors, or 1 genome link )



X78401 

Bacteriophage P22 right operon, orf 48, replication genes 18 and 12, nin 
region genes, ninG phosphatase, late control gene 23, orf 60, complete
cds, late control region, start of lysis gene 13
gi|512343|emb|X78401|POP22NIN [512343]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE
links, 13 protein links, or 4 nucleotide neighbors ) 



Y00408 

Bacteriophage T4 gene t for lysis protein 
gi|15368|emb|Y00408|MYT4T [15368]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 1 protein link, or 3 nucleotide neighbors ) 



Z26590 



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Bacteriophage mv4 lysA and lysB genes 
gi|410500|emb|Z26590|MV4LYSAB [410500]
(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 4
protein links ) 



X07809 

Phage phiX174 lysis (E) gene upstream region
gi|15094|emb|X07809|MIPHIXE [15094]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 2 protein links, or 4 nucleotide neighbors ) 



Z34528 

Lactococcal bacteriophage c2 lysin gene
gi|506455|emb|Z34528|LBC2LYSIN [506455]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 1 protein link, or 4 nucleotide neighbors )



X15031 

Bacteriophage fr RNA genome 
gi|15071|emb|X15031|LEBFRX [15071]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 4 protein links, 9 nucleotide neighbors, or 1 genome link )



X80191 

Bacteriophage PP7 mRNA for maturation, coat, lysis and replicase 
proteins 

gi|517237|emb|X80191|BPP7PR [517237]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 4 protein links, or 1 genome link )



X85010 

Bacteriophage A511 ply511 gene
gi|853748|emb|X85010|BPA511PLY [853748]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor )

X85009 

Bacteriophage A500 hol500 and ply500 genes
gi|853744|emb|X85009|BPA500PLY [853744]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 3 protein links, or 4 nucleotide neighbors ) 



X85008 

Bacteriophage A118 hol118 and ply118 genes

gi|853740|emb|X85008|BPA118PLY [853740]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor ) 






Bacteriophage phi-X174 genes for lysis protein and beta-lactamase
gi|520996|emb|Z35638|BPLYSPR [520996]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 2 protein links, or 516 nucleotide neighbors )



J02459 

Bacteriophage lambda, complete genome 
gi|215104|gb|J02459|LAMCG [215104]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,87 MEDLINE
links, 67 protein links, 190 nucleotide neighbors, or 1 genome link ) 



X87674 

Bacteriophage P1 lydA & lydB genes
gi|974763|emb|X87674|BACP1LYD [974763]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 2 protein links, or 2 nucleotide neighbors ) 



X87673 

Bacteriophage P1 gene 17
gi|974761|emb|X87673|BACP117 [974761]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 1 protein link, or 1 nucleotide neighbor )



M14784

Bacteriophage T3 strain araNG220B right end, tail fiber protein, lysis
protein and DNA packaging proteins, complete cds
gi|215810|gb|M14784|Pr3RB [215810]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 9 protein links, or 10 nucleotide neighbors ) 



M11813

Bacteriophage PZA (from B.subtilis), complete genome
gi|216046|gb|M11813|PZACG [216046]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,3 MEDLINE
links, 27 protein links, 17 nucleotide neighbors, or 1 genome link ) 



M16812 

Bacteriophage K3 V lysis gene, complete cds 
gi|215503|gb|M16812|PK3LYST [215503]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 1 protein link, or 4 nucleotide neighbors ) 



J04356 

Bacteriophage P22 proteins 15 (complete cds), and 19 (3' end) genes 
gi|215265|gb|J04356|P2215P [215265]



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(View GcoBank report.FASTA repon,ASN.l report.Graphicai view,! MEDLINE 
link, 3 protein links, or 2 nucleotide neighbors ) 



J04343 

Bacteriophage JP34 coat and lysis protein genes, complete cds, and
replicase protein gene, 5' end
gi|215076|gb|J04343|JP3COLY [215076]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 3 protein links, or 2 nucleotide neighbors ) 



J02482 

Bacteriophage phi-X174, complete genome
gi|216019|gb|J02482|PX1CG [216019]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,23 MEDLINE
links, 11 protein links, 26 nucleotide neighbors, or 1 genome link )



M99441 

Bacteriophage T4 anti-sigma 70 protein (asiA) gene, complete cds and
lysis protein, 3' end

gi|215820|gb|M99441|PT4ASIA [215820]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,3 MEDLINE
links, 2 protein links, or 2 nucleotide neighbors ) 



M65239 

Bacteriophage 21 lysis genes S, R, and Rz, complete cds 
gi|215466|gb|M65239|PH2LYSGEN [215466]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor ) 



M10637

Phage G4 D/E overlapping gene system, encoding D (morphogenetic) and E
(lysis) proteins

gi|215427|gb|M10637|PG4DE [215427]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 2 protein links, or 12 nucleotide neighbors ) 



J02454 

Bacteriophage G4, complete genome 
gi|215415|gb|J02454|PG4CG [215415]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,6 MEDLINE
links, 11 protein links, 20 nucleotide neighbors, or 1 genome link )



J02580 

Bacteriophage PA-2 (E.coli porcine strain isolate) Rz gene, 5' end; ORF2,
outer membrane porin protein (lc) and ORF1 genes, complete cds
gi|215366|gb|J02580|PA2LC [215366]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 4 protein links, or 4 nucleotide neighbors ) 



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M 14782 

Bacillus phage phi-29 head morphogenesis, major head protein, head fiber
protein, tail protein, upper collar protein, lower collar protein,
pre-neck appendage protein, morphogenesis(13), lysis, morphogenesis(15),
encapsidation genes, complete cds
gi|215323|gb|M14782|P29LATE2 [215323]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 11 protein links, or 11 nucleotide neighbors )



M10997

Bacteriophage P22 lysis genes 13 and 19, complete cds
gi|215262|gb|M10997|P221319 [215262]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 2 protein links, or 3 nucleotide neighbors ) 



J02467 

Bacteriophage MS2, complete genome 
gi|215232|gb|J02467|MS2CG [215232]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,8 MEDLINE
links, 4 protein links, 20 nucleotide neighbors, or 1 genome link ) 



M14035

Bacteriophage lambda lysis S gene with mutations leading to nonlethality
of S in the plasmid pRG1
gi|215180|gb|M14035|LAMLYS [215180]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 1 protein link, or 14 nucleotide neighbors ) 



U04309 

Bacteriophage phi-LC3 putative holin (lysA) gene and putative murein
hydrolase (lysB) gene, complete cds
gi|530796|gb|U04309|BPU04309 [530796]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor ) 



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Table 13 



NCBI Entrez Nucleotide QUERY 

Key word: holin 

51 citations found (all selected) 



AF034975 

Bacteriophage H-19B essential recombination function protein (erf), kil
protein (kil), regulatory protein cIII (cIII), protein gp17 (17), N
protein (N), cI protein (cI), cro protein (cro), cII protein (cII), O
protein (O), P protein (P), ren protein (ren), Roi (roi), Q protein (Q),
Shiga-like toxin A (slt-IA) and B (slt-IB) subunits, and putative holin
protein (S) genes, complete cds
gi|2668751|gb|AF034975| [2668751]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 20 protein links, or 30 nucleotide neighbors ) 

U52961 

Staphylococcus aureus holin-like protein LrgA (lrgA) and LrgB (lrgB)
genes, complete cds

gi|1841516|gb|U52961|SAU52961 [1841516]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor ) 

U28154 

Haemophilus somnus cryptic prophage genes, capsid scaffolding protein 
gene, partial cds, major capsid protein precursor, endonuclease, capsid 
completion protein, tail synthesis proteins, holin, and lysozyme genes, 
complete cds 

gi|1765928|gb|U28154|HSU28154 [1765928]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, or 13 protein links ) 



AF032122

Streptococcus thermophilus bacteriophage Sfi19 central region of genome
gi|2935682|gb|AF032122| [2935682]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 14 protein links, or 2 nucleotide neighbors )

AF032121

Streptococcus thermophilus bacteriophage Sfi21 central region of genome
gi|2935667|gb|AF032121|AF032121 [2935667]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 14 protein links, or 2 nucleotide neighbors ) 



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AF021803 

Bacillus subtilis 168 prophage SPbeta N-acetylmuramoyl-L-alanine amidase
(blyA), holin-like protein (bhlA), holin-like protein (bhlB), and yolK
genes, complete cds; and yolJ gene, partial cds
gi|2997594|gb|AF021803|AF021803 [2997594]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 5 protein links, or 1 nucleotide neighbor ) 



AF057033 

Streptococcus thermophilus bacteriophage Sfi11 gp502 (orf502), gp284
(orf284), gp129 (orf129), gp193 (orf193), gp119 (orf119), gp348
(orf348), gp53 (orf53), gp113 (orf113), gp104 (orf104), gp114 (orf114),
gp128 (orf128), gp168 (orf168), gp117 (orf117), gp105 (orf105), putative
minor tail protein (orf1510), putative minor structural protein
(orf512), putative minor structural protein (orf1000), gp373 (orf373),
gp57 (orf57), putative anti-receptor (orf695), putative minor structural
protein (orf669), gp149 (orf149), putative holin (orf141), putative
holin (orf87), and lysin (orf288) genes, complete cds
gi|3320432|gb|AF057033|AF057033 [3320432]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,25 protein
links, or 1 nucleotide neighbor ) 



U32222 

Bacteriophage 186, complete sequence 
gi|3337249|gb|U32222|B1U32222 [3337249]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,6 MEDLINE
links, 46 protein links, or 5 nucleotide neighbors ) 



AB009866 

Bacteriophage phi PVL proviral DNA, complete sequence 
gi|3341907|dbj|AB009866|AB009866 [3341907]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,63 protein
links, or 1 nucleotide neighbor ) 



AF009630 

Bacteriophage bIL170, complete genome
gi|3282260|gb|AF009630|AF009630 [3282260]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,63 protein
links, 3 nucleotide neighbors, or 1 genome link ) 



AF064539 

Bacteriophage N15, complete genome 



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git3l92683lgblAF064539iAF064539 (3192683] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE
links, 60 protein links, 26 nucleotide neighbors, or 1 genome link )



AF063097 

Bacteriophage P2, complete genome 
gi|3139086|gb|AF063097|AF063097 [3139086]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,21 MEDLINE
links, 42 protein links, 3 nucleotide neighbors, or 1 genome link ) 



Z97974 

Bacteriophage phiadh lys, hol, intG, rad, and tec genes
gi|2707950|emb|Z97974|BPHIADH [2707950]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE
links, 9 protein links, or 1 nucleotide neighbor ) 

X95646

Streptococcus thermophilus bacteriophage Sfi21 DNA; lysogeny module,
8141 bp

gi|2292747|emb|X95646|BSFI21LYS [2292747]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE
links, 19 protein links, or 3 nucleotide neighbors ) 



SEG_LLHLYSIN0

Bacteriophage LL-H structural protein gene, partial cds; minor
structural protein gp61 (g57), unknown protein, unknown protein,
structural protein (g20), unknown protein, unknown protein, major capsid
protein (g34), main tail protein gp19 (g17), holin (hol), muramidase
(mur), unknown protein, unknown protein, unknown protein, unknown
protein, unknown protein, and unknown protein genes, complete cds;
unknown protein gene, partial cds; and unknown protein, unknown protein,
unknown protein, unknown protein, unknown protein, minor structural
protein gp75 (g70), minor structural protein gp89 (g88), minor
structural protein gp58 (g71), unknown protein, unknown protein, unknown
protein, and unknown protein genes, complete cds
gi|1004337|gb||SEG_LLHLYSIN0 [1004337]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,4 MEDLINE
links, 31 protein links, or 1 nucleotide neighbor )



M96254 

Bacteriophage LL-H holin (hol), muramidase (mur), and unknown protein
genes, complete cds

gi|1004336|gb|M96254|LLHLYSIN03 [1004336]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)



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Y07740 

Staphylococcus phage 187 ply187 and hol187 genes
gi|2764982|emb|Y07740|BP187PLYH [2764982]
(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 2
protein links ) 



U88974 

Streptococcus thermophilus bacteriophage 01205 DNA sequence 
gi|2444080|gb|U88974| [2444080]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 57 protein links, or 6 nucleotide neighbors ) 



Z99117 

Bacillus subtilis complete genome (section 14 of 21): from 2599451 to 
2812870 

gi|2634966|emb|Z99117|BSUB0014 [2634966]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,233
protein links, 51 nucleotide neighbors, or 1 genome link ) 



Z99115 

Bacillus subtilis complete genome (section 12 of 21): from 2195541 to 
2409220 

gi|2634478|emb|Z99115|BSUB0012 [2634478]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,244
protein links, 64 nucleotide neighbors, or 1 genome link ) 



Z99110 

Bacillus subtilis complete genome (section 7 of 21): from 1194391 to
1411140

gi|2633472|emb|Z99110|BSUB0007 [2633472]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,226
protein links, 31 nucleotide neighbors, or 1 genome link )



X78410 

Bacteriophage phiadh holin and lysin genes 
gi|793848|emb|X78410|LGHOLLYS [793848]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor ) 



Z93946 




Bacteriophage Dp-1 dph and pal genes and 5 open reading frames
gi|1934760|emb|Z93946|BPDP1ORFS [1934760]
(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 6
protein links ) 



AF011378

Bacteriophage sk1 complete genome
gi|2392824|gb|AF011378|AF011378 [2392824]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,54 protein
links, 2 nucleotide neighbors, or 1 genome link ) 



Z47794 

Bacteriophage Cp-1 DNA, complete genome
gi|2288892|emb|Z47794|BPCP1XX [2288892]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,3 MEDLINE
links, 28 protein links, 1 nucleotide neighbor, or 1 genome link ) 



L35561

Bacteriophage phi-105 ORFs 1-3
gi|532218|gb|L35561|PH50RFHTR [532218]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, or 3 protein links )



D49712 

Bacillus licheniformis DNA for ORFs, xpaL2 homologous protein and xpaL1
homologous protein, complete and partial cds
gi|1514423|dbj|D49712|D49712 [1514423]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE
links, or 4 protein links ) 



X90511 

Lactobacillus bacteriophage phig1e DNA for Rorf162, Holin, Lysin, and
Rorf175 genes

gi|1926386|emb|X90511|LBPHIHOL [1926386]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,4 protein
links, or 1 nucleotide neighbor )



X98106 

Lactobacillus bacteriophage phig1e complete genomic DNA
gi|1926320|emb|X98106|LBPHIG1E [1926320]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE




link, 50 protein links, or 4 nucleotide neighbors ) 



U72397 

Bacteriophage 80 alpha holin and amidase genes, complete cds 
gi|1763241|gb|U72397|B8U72397 [1763241]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 protein
links, or 2 nucleotide neighbors ) 



U38906 

Bacteriophage r1t integrase, repressor protein (rro), dUTPase, holin and
lysin genes, complete cds
gi|1353517|gb|U38906|BRU38906 [1353517]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE
links, 50 protein links, or 3 nucleotide neighbors ) 



X91149 

Bacteriophage phi-C31 DNA cos region
gi|1107473|emb|X91149|APHIC31C [1107473]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 6 protein links, or 1 nucleotide neighbor ) 

U24159 

Bacteriophage HP1 strain HP1c1, complete genome

gi|1046235|gb|U24159|BHU24159 [1046235]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,6 MEDLINE
links, 41 protein links, 8 nucleotide neighbors, or 1 genome link ) 



Z26590 

Bacteriophage mv4 lysA and lysB genes 
gi|410500|emb|Z26590|MV4LYSAB [410500]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 4
protein links ) 



Z70177 

B.subtilis DNA (28 kb PBSX/skin element region)
gi|1225934|emb|Z70177|BSPBSXSE [1225934]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,32 protein
links, or 4 nucleotide neighbors ) 



Z36941 
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B.subtilis defective prophage PBSX xhlA, xhlB, and xylA genes 
gi|535793|emb|Z36941|BSPBSXXHL [535793]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,4 protein
links, or 5 nucleotide neighbors ) 



X89234 

L.innocua DNA for phage lysin and holin gene
gi|1134844|emb|X89234|LICPLYHOL [1134844]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 2 protein links, or 4 nucleotide neighbors ) 



X85010 

Bacteriophage A511 ply511 gene
gi|853748|emb|X85010|BPA511PLY [853748]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor ) 



X85009 

Bacteriophage A500 hol500 and ply500 genes
gi|853744|emb|X85009|BPA500PLY [853744]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 3 protein links, or 4 nucleotide neighbors ) 



X85008 

Bacteriophage A118 hol118 and ply118 genes
gi|853740|emb|X85008|BPA118PLY [853740]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor ) 



L34781 

Bacteriophage phi 11 holin homologue (ORF3) gene, complete cds and
peptidoglycan hydrolase (lytA) gene, partial cds
gi|511838|gb|L34781|BPHHOLIN [511838]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 4 protein links, or 2 nucleotide neighbors ) 

U11698

Serratia marcescens SM6 extracellular secretory protein (nucE), putative
phage lysozyme (nucD), and transcriptional activator (nucC) genes,
complete cds

gi|509550|gb|U11698|SMU11698 [509550]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE






link, 3 protein links, or 1 nucleotide neighbor ) 



U31763 

Serratia marcescens phage-holin analog protein (regA), putative phage 
lysozyme (regB), and transcriptional activator (regC) genes, complete 
cds 

gi|965068|gb|U31763|SMU31763 [965068]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor ) 



X87674 

Bacteriophage P1 lydA & lydB genes
gi|974763|emb|X87674|BACP1LYD [974763]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE
link, 2 protein links, or 2 nucleotide neighbors ) 

L48605 

Bacteriophage c2 complete genome 
gi|1146276|gb|L48605|C2PVCG [1146276]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,3 MEDLINE
links, 39 protein links, 3 nucleotide neighbors, or 1 genome link ) 



L33769 

Bacteriophage bIL67 DNA polymerase subunit (ORF3-5), essential 
recombination protein (ORF13), lysin (ORF24), minor tail protein 
(ORF31), terminase subunit (ORF32), holin (ORF37), unknown protein (ORF
1-2,6-12,14-23,25-30,33-36), complete genome
gi|522252|gb|L33769|L67CG [522252]

(View GenBank report,FASTA report,ASN.l report,Graphical view,l MEDLINE 
link, 37 protein links, 2 nucleotide neighbors, or 1 genome link ) 



L31348 

Bacteriophage Tuc2009 integrase (int) gene, complete cds; lysin (lys) 
gene, 3' end 

gi|508612|gb|L31348|TU2INT [508612] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE 
links, 3 protein links, or 3 nucleotide neighbors ) 



L31364 

Bacteriophage Tuc2009 holin (S) gene, complete cds; lysin (lys) gene, 
complete cds 

gi|496281|gb|L31364|TU2SLYS [496281] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 2 protein links, or 1 nucleotide neighbor ) 



L31366 

Bacteriophage Tuc2009 structural protein (mp2) gene, complete cds 
gi|496278|gb|L31366|TU2MP2A [496278] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 2 protein links, or 1 nucleotide neighbor ) 



L31365 

Bacteriophage Tuc2009 structural protein (mp1) gene, complete cds 
gi|496276|gb|L31365|TU2MP1A [496276] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, or 1 protein link ) 



U04309 

Bacteriophage phi-LC3 putative holin (lysA) gene and putative murein 
hydrolase (lysB) gene, complete cds 
gi|530796|gb|U04309|BPU04309 [530796] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 2 protein links, or 1 nucleotide neighbor ) 

Table 14 



NCBI Entrez Nucleotide QUERY 
Key word: bacteriophage and kil 
5 citations found (all selected) 
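
Each citation in a listing of this kind carries an identifier line which, in NCBI's pipe-delimited convention, has the form gi|number|database|accession|locus followed by the gi number in square brackets. Purely as an illustration, and not as part of the original query output, a minimal Python sketch for splitting such a line into its fields might look as follows. The names `EntrezId` and `parse_id_line` are hypothetical and do not belong to any NCBI tool.

```python
from typing import NamedTuple


class EntrezId(NamedTuple):
    """Fields of one identifier line (hypothetical container, for illustration)."""
    gi: int
    database: str
    accession: str
    locus: str


def parse_id_line(line: str) -> EntrezId:
    """Split 'gi|<number>|<db>|<accession>|<locus> [<number>]' into fields.

    Assumes the pipe-delimited layout described above; the locus field is
    treated as optional because some entries omit it.
    """
    head = line.split("[", 1)[0].strip()      # drop the trailing "[gi]" echo
    fields = head.split("|")
    if len(fields) < 4 or fields[0] != "gi":
        raise ValueError(f"not a gi identifier line: {line!r}")
    locus = fields[4] if len(fields) > 4 else ""
    return EntrezId(int(fields[1]), fields[2], fields[3], locus)
```

For example, `parse_id_line("gi|215104|gb|J02459|LAMCG [215104]")` yields the gi number 215104, the database tag `gb`, the accession `J02459` and the locus name `LAMCG`.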



AF034975 

Bacteriophage H-19B essential recombination function protein (erf), kil 
protein (kil), regulatory protein cIII (cIII), protein gp17 (17), N 
protein (N), cI protein (cI), cro protein (cro), cII protein (cII), O 
protein (O), P protein (P), ren protein (ren), Roi (roi), Q protein (Q), 
Shiga-like toxin A (slt-IA) and B (slt-IB) subunits, and putative holin 
protein (S) genes, complete cds 
gi|2668751|gb|AF034975 [2668751] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 20 protein links, or 30 nucleotide neighbors ) 

X15637 

Bacteriophage P22 P(L) operon encompassing ral, 17, kil and atf genes 
gi|15646|emb|X15637|POP22PL [15646] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 7 protein links, or 2 nucleotide neighbors ) 



J02459 

Bacteriophage lambda, complete genome 
gi|215104|gb|J02459|LAMCG [215104] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,87 MEDLINE 
links, 67 protein links, 190 nucleotide neighbors, or 1 genome link ) 



M64097 

Bacteriophage Mu left end 
gi|215543|gb|M64097|PMULEFTEN [215543] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE 
links, 39 protein links, or 15 nucleotide neighbors ) 



M18902 

Bacteriophage D108 kil gene encoding a replication protein, 3' end; and 
containing three ORFs, complete cds 
gi|166191|gb|M18902|D18KIL [166191] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 1 protein link, or 3 nucleotide neighbors ) 

Table 15 

Table 16 

Phage 44AHJD complete genome sequence. 16668 nucleotides. 
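
The listing that follows is laid out as rows of seven 10-nucleotide blocks, each row prefixed by the coordinate of its first base. Purely as an illustration, a minimal Python sketch for joining such rows back into one contiguous string might look as follows. The function name `join_numbered_rows` is hypothetical, and the check that each coordinate equals the running length plus one is an assumption about how the listing is numbered.

```python
from typing import Iterable


def join_numbered_rows(rows: Iterable[str]) -> str:
    """Concatenate coordinate-prefixed rows of nucleotide blocks.

    Each non-blank row is expected to look like
    '71 atcaaataca aaacttatgt ...'. A ValueError is raised when a row's
    coordinate does not continue the sequence, which helps to spot a
    damaged coordinate or a missing block.
    """
    chunks = []
    length = 0
    for row in rows:
        tokens = row.split()
        if not tokens:
            continue                          # skip blank separator lines
        position = int(tokens[0])
        if position != length + 1:
            raise ValueError(
                f"row starts at {position}, expected {length + 1}"
            )
        chunk = "".join(tokens[1:]).lower()
        chunks.append(chunk)
        length += len(chunk)
    return "".join(chunks)
```

Applied to the whole listing, the length of the returned string can be compared with the stated genome size of 16668 nucleotides.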

1 tccatttctt tactaaactt aaaaatgctg tgcaacaact taaccaactt atctaaccta ttacatattc 

71 atcaaataca aaacttatgt atctattgac ttttattcaa aateatgatt tcaacatata ataaaattaa 

141 tttacttact eaaatattct atgatataat tagttaeaaa atattcggag gegtataaat gacagaattt 

211 gatgaaatcg caaaaccaga cgacaaagaa gaaaccccag aaccaactga agaaaatcca gaaccaactg 

281 aagaaactcc agaatcaact gaagaatcaa ctgaagaacc aactgaagaa tcaactgaag acaaaacagt 

351 agaaacaacc gaagaagaaa atgaaaacaa attagaacct actacaacag atgaagatag ttcgaaattc 

421 gaccctgttg tattagaaca acgtattget tcattagaac aacaagtgac tactttttta ccttcacaaa 

491 tgcaacaacc acaacaagta caacaaacac aatcagatgt aacagaacca aacaaagaag acaacgacta 

561 ttcagacgaa gaaccagttg ataagttaga tttagatcag gaggaactta aacatgtatg agggaaacaa 

631 catgcgttct atgatgggta catcatatga agacccaaga ttaaataaac gaacagaatt aaatgaaaac 

701 atgtcaattg acacaaataa aagtgaagat agccatggtg tacaaattca ttcacttcca aaacaatcat 

771 ctacaggtga cgttgaggag gaataataaa ttatggcaca acaatctaca aaaaacgaaa ccgcactttt 

841 agtagcaaag tcagctaaat cagcgttaca agattttaat cacgatcatt caaaatcttg gacatttggc 

911 gacaaatggg ataattcaaa tacaatgttc gaaacatttg taaataaata tttattccct aagattaatg 

981 agactttatt aatcgatatt gcactaggta atcgttttaa ttggttagct aaagagcaag attttattgg 

1051 acaatatagt gaagaatacg cgactatgga cacagtacca accaacatgg acetatctaa aaatgaggaa 

1121 ttaatgttga aacgtaatta tccacgtatg gcaactaagt tatatggtaa cggaattgeg aagaaacaaa 

1191 aattcacatt aaacaacaat gacacacgtt tcaatttcca aacattagca gacgcaacea attacgcttt 

1261 aggtgtatac aaaaagaaaa ttcctgatat taatgtatca gaagaaaaag aaacgcgcgc aatgctagtt 

1331 gattactcat tgaatcaatt atccgaaaca aatgtacgta aagcaacacc aaaagaagat ttagcaagca 

1401 aagtttttga agcaaeccta aacttacaaa acaacagtgc caaatacaat gaagtacatc gtgcatcagg 

1471 tggtgcaatc ggacaatata caactgtatc aaaaetaaaa gatattgtga ttttaacaac agattcatta 

1541 aaatcttatc cttcagatac taagattgca aacacactcc agattgcagg cattgatctc acagatcacg 

1611 ccactagtcc cgacgactta ggtggcgtgt ttaaagcaac aaaagaactc aagctacaaa accaagatcc 

1681 aattgacttc ttacgtgcgt atggagacta tcaatcacaa ttaggagata caattccagt tggtgctgta 

1751 tttacttatg atgtaeceaa acttaaagag tttactggca acgctgaaga aattaaacca aaatcagatt 

1821 Catatgcgtt tattttggat attaattcaa ttaaatataa acgctacaca aaaggtacgt taaaaccacc 

1891 attccataac cctgaacttg atgaagttac acactggatt cattactatt catttaaagc cattagtcca 

1961 ccccttaata aaattttaat tactgaccaa gatgtaaatc caaaaccaga ggaagaaeta caagaataaa 

2031 aggagcgtaa aacatgaaca acgacaaaag aggtttaaac gttgagttat caaaggaaat cagcaaaaga 

2101 gttgttgaac atcgcaacag atttaaacgt cttatgttta atcgttattt ggaattttta ccgctactaa 

2171 tcaactatac caatcgtgat acggttggta tagattttat tcagttagaa tcagctctaa gacaaaacat 

2241 taatgtagtt getggtgaag ctagaaataa gcaaattatg attcttggtt atgtaaataa cacttacttt 

2311 aatcaagcac caaatttttc atcaaacttt aatttccaat ttcaaaaacg attaaccaaa gaagatatat 

2381 actttattgt acctgactat ttaatacctg atgattgtct acaaattcat aagctatacg ataactgtat 

2451 gagtggtaac cttgtcgcca cgcaaaacaa accaattcaa tacaacagtg atatagaaat cacagaacat 

2521 tatactgatg aattagcaga agttgcttta tctcgctttt ctttaatcat gcaagcaaaa ettagcaaga 

2591 tatttaaatc agaaattaat gacgagtcaa tcaatcaact tgtgtccgaa atatacaacg gtgcaccatt 

2661 tgttaaaatg tcacctatgt ttaatgcaga tgacgatatc attgatttaa caagtaatag cgtaatccca 

2731 gcactaactg aaatgaaacg ggaatatcaa aacaaaatta gcgaaccaag caactattta ggcattaatt 

2801 catcagccgt tgataaagaa agcggtgttt cagacgaaga ggcaaaaagt aatcgtggat ttaccacatc 

2871 aaacagtaac atctatttaa aaggtcgtga accaattacg tttttatcaa agcgttatgg tttagatatt 

2941 aaacegtatt acgatgatga aacaacgtct aaaatatcaa tggtagacac actttttaaa gatgaaagca 

3011 gcgatacaaa tggctagata cacaatgace ttatacgatt tcattaaatc agaattgatt aaaaaaggtt 

3081 tcaatgaatt tgtaaatgat aataaattaa cgttttatga tgacgaattt caattcacgc aaaaaatgct 

3151 gaagtccgac aaagacgttt tagctatcgt taatgaaaaa gtatttaaag gtttttcatt gaaagatgaa 

3221 ccatcagatt tactttttaa aaaatcactc acgattcatt ttttagatag agaaatcaac agacaaacag 

3291 ttgaagcatt tggcatgcaa gtgattactg tatgtattac acatgaggat tattcaaacg tggtttattc 

3361 accaagtgaa gttgaaaaat acttacaatc acaaggcttc acagaacaca atgaagatac aacaagtaac 

3431 actgatgaaa caccgaacca aaaegctaca tctttagaca atccaactgg catgactgca aacagaaacg 

3501 cttatgtgtc attaccacaa agtgaggtta acaecgacgt tgacaataca acgttacgae tcgctgataa 

3571 taatacgatt gataacggta aaactgcgaa taaatcgagt aacgaaagta atcaaaacgc aaaacgtaat 

3641 caaaaccaaa aaggtaatgc aaaaggcaca caattcacta agcagtattc aattgataat actgataaag 

3711 cgtacgattt aagaaagaaa aetttaaatg aacccgacaa aaaacgctcc ctacaaacct ggtagaggtg 

3781 gttaaataat ggcacacaat gaaaacgatt ttaaatattt cgacgacatt cgeccatttt tagacgaaat 

3851 ttataaaacg agagaacgct acacaccgtt ttacgatgat agagcagact acaacacCaa ttcaaaatca 

3921 tatcatgatt acatttcaag attatcaaaa ctaattgaag tactagcacg tcgtatctgg gactatgaca 

3991 atgaattaaa aaaacgtttc aaaaattggg acgacctaat gaaagcattc ccagagcaag cgaaagactt 

4061 atttagaggt cggttaaacg acggcacgat tgacagtact atccacgacg agtctaaaaa atacagcgca 

4131 ggattaacat cggcatctgc cttatttaaa gctactgaaa cgaaacaaat gaatgactce aaatcagaag 

4201 ttaaagactt aattaaagat attgaccgtt tcgttaatgg gtttgaatca aatgagcttg aaccaaagtt 

4271 cgrgatgggc tttggtggta ttcgcaacgc agctaaccaa tctaccaata ttgataaaga aacaaatcac 

4341 atgtactcca cacaatccga ttctcaaaaa cctgaaggct tteggataaa taaaetaaca cctagtggtg 

4411 acttaatttc aagcatgcgt attgtacagg gcggccacgg tacaacaatc ggattagaac grcaatccaa 

4481 tggcgaaacg aaaacctggc cacatcacga tggtgttgca aaactgccac aagtcgcaca taaagataat 

4551 tatgtateag actcagaaga ggctaaaggt ttaacagatt acacaccaca gtcaccccca aacaaacaca 

4621 cacttacacc getaattgat gaagcaaatg acaaacccac tttaagattc ggtgacggaa caatacaggt 

4691 tcgttcaaga gcagacgtaa aaaatcacat tgataatgta gaaaaagaaa tgacaattga eaattcagaa 

4761 aacaatgata atcgttggat gcaaggcatt gctgccgatg gcgatgatct atactggtta agtggtaaca 

4831 gttcagctaa ttcacatgtt caaatcggta aacacccacc aacaacaggc caaaagactt atgattatcc 

4901 atteaagtta tcatatcaag acggtattaa cttcccacgt gacaacttta aagagcctga gggtatttgc 

4971 atttatacaa acccaaaaac aaaacgcaaa tcgtcattac tcgctatgac aaacggcggt ggtggaaaac 

5041 gtttccataa tttatatggt ttcttccaac ttggtgagca tgaacacttt gaagcattac gcgcaagagg 

5111 ttcacaaaac tacaaattaa caaaagacga cggtcgtgca ttatctattc cagaccatat cgacgatcta 

5181 aatgacttaa cgcaagctgg tttttattat actgacgggg gtactgcaga aaaacttaag aatatgccaa 

5251 egaaeggtag caagcgtaca actgacgccg grtgcttcac taacgtatac cctacaacac aaacattagg 

5321 tacggcccaa gaattaacac gtctctcaac aggccgtaaa atggctaaaa tggtgcgtgg tatgacttta 

5391 gacgtattta cgttaaaatg ggattatgga ttatggacaa caatcaaaac tgacgcacca tatcaagaat 

5461 atttggaagc aagtcaatac aataactgga tcgcttatgt aacaacagct ggtgagtatt acattacagg 

5531 taaccaaacg gaattattta gagacgcgcc agaagaaatc aaaaaagtgg gtgcatggtt acgtgcgtca 

5601 agtggtaacg cagccggcga agcaagacaa acattagagg ctaatatatc ggaatataaa gaattcttca 

5671 gtaatgttaa tgcggaaaca aaacatcgtg aatatggttg ggtagcaaaa caccaaaaat aggagtgata 

5741 taaatgaaat cacaacaaca agcaaaagaa tggatataca agcatgaggg ggcaggtgtt gactttgatg 

5811 gtgcatatgg atttcaatgt atggacttat cagttgctca tgtgtattac attactgacg gtaaagttcg 

5881 catgtggggt aatgctaaag acgcgataaa taatgacttt aaaggtttag cgacggtgta taaaaataca 

5951 ccgagcttta aacctcaatt aggggacgct gctgtatata caaacggaca atatggacat attcaatgtg 

6021 tgttaagtgg aaaccttgac tattatacat gcttagaaca aaaetggtta ggcggcggtt ctgacggtcg 

6091 ggaaaaagca accattagaa cacactatta tgacggtgta acccacttca ttagacctaa attttcaggt 

6161 agtaatagca aagcattaga aacatcaaaa gtaaatacat ttggaaaatg gaaacgaaac caataeggca 

6231 catattatag aaatgaaaat ggtacattta catgtggttt tttaccaata tttgcacgtg tcggtagtcc 

6301 aaaattatca gaacctaatg gctattggtt ccaaccaaac ggtcatacac cacacaacga agtttgttta 

6371 tcagatggtt acgtatggat tggttataac tggcaaggca cacgttacta tttaccagtg cgccaatgga 

6441 atggaaaaac aggtaatagt tacagtgttg gtattccttg gggggtgttc tcataatggg tattttagcc 

6511 tttttctttg aatttagttg gaaaagatac aaataagagg tgcaaacaat ggctgataga atcgtaagaa 

6581 gtttaagaca agctgaaaca atcgaacgct tattggagga aaaaaatgag aaagttaacg aattttaagt 

6651 ttttctataa cacaccgctt acagaccatc aaaacacgat tcattttaat agtaataaag aacgcgatga 

6721 ttatttttta aatggtcgtc attttaaatc gttagaccat tcaaaacaac cgtataattt tatacgtgat 

6791 agaacggaaa tcaatgttga tatgcagtgg cacgacgcac aaggtattaa ceacaegacg tttttaccag 

6861 attttgagga cagaagatat tacgcttttg taaaccaaat cgaacacgtg aatgacgttg tggttaaaat 

6931 atattttgtc attgatacca ttatgacgta tacacaaggg aacgtatcag agcaactctc aaacgtcaat 

7001 attgaacgtc aacatttatc aaaacgcacg tataactata tgttaccaat gttacgtaat aatgatgatg 

7071 cgttaaaagt atcaaataaa aaccatgttt ataaccaaat gcaacaatat ttggaaaact tagtattatt 

7141 ccagtcaagc gctgatttat caaagaaact cggtaetaaa aaagagccaa acttagatac gteaaaaggt 

7211 acgatctatg acaatatcac accaccagtc aacttatacg tcatggaata tggtgacttt attaacttta 

7281 tggataaaac gagtgcctat ccatggatca cgcaaaactt tcaaaaggtt caaacgttac ctaaagacct 

7351 tattaataca aaagactcag aggacgttaa aaccagtgaa aaaactacag gattaaaaac attaaaacag 

7421 ggtggtaaat caaaagaatg gagtctaaaa gatttatcat taagtttetc aaatcttcaa gagatgatgt 

7491 tatctaaaaa agatgaattt aaacatatga tacgtaatga gcatatgaca attgaatttt acgactggaa 

7561 tggaaacacg aegttactcg acgccggcaa gatttcacaa aaaactggtg ttaagteacg tacaaaatca 

7631 actattggtt atcataatga agttcgagta tatccagtag attataacag tgctgaaaac gacagaccaa 

7701 tactcgctaa aaataaagaa atattgattg atacgggttc actcctaaat acaaatataa catttaatag 

7771 ttttgcacaa gtaccaatat taatcaacaa cggtatccta ggacaatcac aacaagccaa ccgacaaaaa 

7841 aacgcagaaa gccaactaat tacaaaccgt attgataatg tatcaaatgg tagcgacccg aaatcacgct 

7911 tttatgacgc tgtgagtgta gcaagtaatt taagtccaac tgctttattt ggtaagttta atgaagaata 

7981 taatttctac aaacaacaac aagctgaaca taaagattca gccttacaac eacettctgt aactgaatca 

8051 gaaacgggca acgcatccca aaccgcgaat agcatcaacg gcctaacgat gaaaatcagt gtacegtcae 

8121 ctaaagaaat tacattttca caaaaacatt atatgttgtt tggttttgaa gtgaatgact acaatteact 

8191 tattgaacca attaacagta tgactgtttg caattattta aaatgtacag gtacgtatac tatacgtgac 

8261 atcgacccca tgttaatgga acaattaaaa gcaattttag aatctggtgt aagattttgg cataatgacg 

8331 gttcaggtaa tccaacgtca caaaatccat caaacaacaa atttagagag ggggtataat atgaacgaag 

8401 taaa*ttcag acttacagac tcagaagcgt ttcacatgtt tatatacgct ggggatttaa aattactcta 

8471 ctttttattt gtattaatgt tcgttgatat tattacaggt acttcaaaag caateaaaaa taataactta 

8541 tggtcaaaaa aatcaatgag aggacctcct aaaaaatcac tgatattctg tattatcatt ttagcaaaca 

8611 tcaccgacca gactttacaa ctaaaaggtg gtctactcat gattacaata ttttattata ttgcaaatga 

8681 gggaccttct attgtagaaa attgtgcaga aatggacgta ttagtaccag aacaaactaa agataaatta 

8751 agagtcatta aaaacgacac tgaaaagagc gataacaatg aacgatcaag agaagataga taaatttacg 

8821 cattcctata ttaatgatga ctetggttta aegaeagacc agctagtccc taaagtaaaa ggatatgggc 

8891 gctttaatgt atggcttggt ggtaatgaaa gtaaaatcag acaagtatta aaagcagtaa aagagatagg 

8961 tgtttcacct actctttttg ccgtatatga aaaaaatgag ggttttagtt ctggacttgg ttggttaaac 

9031 catacgtctg cacgtggtga ttatttaaca gatgccaaat tcatagcaag aaagttagta tcacaatcaa 

9101 aacaagctgg acaaccgtct tggtacgacg caggtaacat cgcccacttt gtaccacaag acgtacaaag 

9171 aaaaggtaat gcagatcctg caaaaaatat gaaagcaggt acaattggae gtgcatatat tccattaaca 

9241 gcagctgcta cttgggcggc atattatcct ttaggtttga aagcatcata taacaaagta caaaactatg 

9311 gtaatccact tctagacggt gcgaatacta ttctagcccg gggtggtaaa ttagacggca aaggtggatc 

9381 acctagtgat tcgcctgaca gtggtagcag cggtgacagc ggtagttcac tactcgctct agcaaaacaa 

9451 gccatgcaag aaetattaaa aaaaatacaa gacgcattac aacgggacgt tcatagtatt ggtagtgata 

9521 aattttttag taatgattat tttacattag aaaaaacatt taacaacaca tatcatatta aaatgacgat 

9591 tggtttactt gattcattaa aaaaaccgac cgatagcgcc caagtagaca gtgggagrag tagttccaat 

9661 cctactgatg atgacggaga ccataaacca attagtggca aatcagtcaa gccaaatgga aaaagtggte 

9731 gtgtgattgg tggtaactgg acacatgcac agttaccaga aaaatataaa aaagcaattg gtgtaccttt 

9801 attcaaaaaa gaacacttat acaaaccagg taacatattt ccccaaacgg gtaatrgcagg acaatgtaca 

9871 gaaccaacac gggcgtacat gtcacaacca catggtaaaa gacaacctac cgacgacggt caaataacaa 

9941 acggccagcg tgtatggtac gtctataaaa agccaggtgc aaaaacaaca cataatccaa cagtaggtta 

10011 tggtttctcc agtaaaccac catacctaca agcaactgca tacggtateg gtcacacagg tgttgttgta 

10081 gcagtttttg aagacggttc gtttecagtt gcaaactata acgtaccacc atatgetgca ccatcacgcg 

10151 cggcaccgca cacactcatt aatggcgtac caaacaacgc cggcgataac atcgcattce ttagtggtat 

10221 tgcttaatta accatgctat aaCgaacaca tgctagcaat gccagtaaat aaaatacaaa acataatcaa 

10291 ttttcgtaca cattettcat gttatcccaa aaagaaaagg agaccgttac tttaacagtc gccttttttt 

10361 atctcatcat gttcacgttt taatatatgc aaaccagatt tgccacgtac tgaacgttca actggaaata 

10431 agtcgttaag cgaaaacgaa ccgaegtcac tteeaatata aagaacacca tcaaattgac tatggrtcgaa 

10501 attttcccta gcgcccccta acataaattc acgccccaca tcaagttcac cagcaaaaca tccaccatat 

10571 acattaccac acacaatctc agttttagac ggatatatcg atactgcacc ttgctcacta tagatacttt 

10641 tategcetcc aataaeggca ccgtcaaaga attgtccacg cacaaaggct ecaaaatcga cgcctgtatc 

10711 aaaggcgctt ttcggtatac cagcagaagc aattttaatc cttccattca cttcatatgc atacttctta 

10781 egatccagca caaacatctt acctatctgt tcgtccccaa eaccccacct acctaaggct atcgggtcga 

10851 ataaactggg gctcaataag ggtttaacaa cggacttcac acacaaacca ccagcaccgc aataaataaa 

10921 actgtcgtca atttcacttt ccgttaagta ctggaaagga accaacaagc tatacaaega acgtgacgtg 

10991 acaaatgtag agaataatat attacgtcca gtgtttttgc aaccgttaac gatattgtat agttcattgt 

11061 tatcatctaa acggaataag ttaaaatgtg aacgtaatgc aggcatgcca tataatccat ttaaaacgac 

11131 tctagacaac ataacctcct cattegagca tgggtgctcg ctgatatcat cagcaacgcg atagtcgtaa 

11201 ggtgacgtca taccgaectt gtCttecaac ttaccttgtg tcccaacaaa atagtcctga aaaataatat 

11271 cacgtgcatg aaagtattca cattcatata caacaaacga attaacacgt acatgcacgc aaccaatacc 

11341 cgtaatgtct tgaatcattc ttaatgtact tgtattgata ttaacgtaat cattatcatt aetatagtat 

11411 cttacaatca tttgacgtaa tacacgtgat ttaattttaa tcaataaatc atcgttaaat acatctttat 

11481 caatcttata taatgaaaaa taattgtcat catctaaaaa agcagggaCt aacgttggtt ctgaatagtg 

11551 ttcgtaaaag tataaccatg ttggaatttt ttcatgatac atcacataag gataactcga atcgatgtca 

11621 acagaaaaac aaggctcatc aattagtttg tttatgtatt tggtgttata catatttaaa ccaccacgat 

11691 agaatgattt aatatagtca taaaaaceca eaccaeggaa atgacaatgt gtataagata ttctaatatc 

11761 ttgatattgg ctgagtaact gaaaacgtgt catctcatca ttcaagtaag attccataat attcaatgaa 

11831 aatgttaate tgttatagtc aaaatttgga aatataccac tataatgaat acggcacaca cctaatataa 

11901 tcacgccatc acgaatgcat gtaagttgtt caggtgtgag tcccgcaaaa catttcacag cacagtcata 

11971 ggcttcacta tcattcatat cattatcttt accaaaaatc gtacaatcaa aatccgtttt aagttgtgat 

12041 tctgttaaat aaccaccatc aagtaatttc ttacctaatg cegcaattga tgtattggte ttcataaagt 

12111 tatcaacaac attaaattta aaaccattca aaaacattgt taaatctaaa ttgattgaag atttaacacg 

12181 tttttccaaa atcacatttt gatcttcggc taaaatagta gcccctttca tttttaatgt gtgttcattt 

12251 tcttctgcag atttcaaata catatttccg cgtgtaatac taccaaaaca acgcatggtg tctttaagta 

12321 aaaaatgatt atcgtattta ttacagttat gtgcaatcat gataatacct gtttttgaet tcgtgactgt 

12391 atcacgtctt tccacatacg CaCaaaatgc gtcataaaaa gatccgaaac tcggaaatac ttcaacatca 

12461 aettcataac cattaaacca accaattgct acagaataag taacgttttt atatttggtt ggtttttttc 

12531 gtccgttaac tccactgcac gctaatgttt ctacatccca gcacaaaatc attcgacgtt catgtttatg 

12601 acaccgcacg cattceagta atcccataat ctcacacacc ttttacaagc catattgtct cattagatac 

12671 cttttcgtat tccctatata gttatcttcg tatatttttt cttctcttcc aaacteactc atatttttct 

12741 tcatttcatt ttttatatga aattttataa teteattcat atctaaatat aaacatctat cattatcaac 

12811 cacgtaaect ttagagtaag cattgtcaaa atgtaaattg cttggattgt agtaataacg ttceatgttt 

12881 tcttcataaa acatatcatc acgtaaacag gcaacatgat cgtctatatc cctaatttta gtacaaaatt 

12951 catattgttc tgtatatggt acaacgataa tatttgtcat aaaagtagtt acatcataca tgactttaat 

13021 acattcacca teagttttga tatagaagaa atcaccgttt tgattgatgt gatttcttaa attatcatcc 

13091 gccaaattat attcgttaaa ttcaaattct ccagttgtca tagcgtcgtc atttgaatta aacgcacgtg 

13161 egttaegttt ttcattcacg taatcgtttc gccgcatttc caaaaaaatg tttttgcaaa gtcttgatgt 

13231 attcatttta tgcttttgta ataaattgta catacttaaa ctggataata taggacttga aaagctgact 

13301 gcattaccta gtaaaaacat cccagggaat ccaatataat caacgttacc atggttacgg tcgattgatt 

13371 catatatcgt tttcaactta tcccactcat caattaaata atcatcttca agcgctaaaa actcatcata 

13441 cacaataata ggacagtget ttaaaaagtt agaatgatat ctcaaatcag tggcactatc caaatctgta 

13511 atcacaccaa tctctttatc ttgatagata atagctaaat agcccctagc acttctgaac gtgacacgtt 

13581 ttgatctaaa tagtggatct tcatctatga ccccctcaac aaaatcacgg taagcgccac gtaatgtata 

13651 atgacgtgat aataaagtaa attttacatc aagttcaaca gccaaacaaa taaaaaatga aacatagttg 

13721 aacgattttc catcagaacg gtttgaaaca gacatataat aacctatatc atcattcaea agttcatcaa 

13791 ctaattccat ttgattatac ttatctggga ccctctctcc gacaegattg acagcatttt gataatctct 

13861 taccatgtct aaacgatcct gctctaccac gcctttgctc cctgcaatag ttcatgatgt cgttcacagt 

13931 gttaaattta ttcgtcaaat gttgcataat acaaaaagtc acacctcaca tcttcaccat caatatttgt 

14001 cactggtcta tctgatttac caatctcctc atataaagta tcgatctctt taatatattt atacattgaa 

14071 gaattatcat ttttagcttg taaattatat aaagcgtatc catgcttttt agcgttttta ttattagaat 

14141 catcattaeg gttatacact tcaagaatat aacccaaccc cteaegtctt gaacctctta ccaatgatac 

14211 agcatttaca tatgatacgt ttctttctct aggaaaatag ggcagacgcg caaaatgctt ccatgtgtca 

14281 atgtacgcct cttgtaaacc tttatcatca aacttaaaat caacaccacc aaaatcaccc aaaaataaat 

14351 ctctctcccg cccttttcta gcttctcttt ctcccttcca tccatccatc tcagacgtafc gtctaaccaa 

14421 tgttatcaac ctccacacaa agcataaata accactaaaa agataatata gaatataatc aacgtagtga 

14491 ataaaacacc aaatgaeacg cgtatatgca gtgccacaag tacgataagt gcaattaaaa acgccaaaag 

14561 gaaaacaatg gctatgttta acaggttatt cacggtcaac cactteccca ttatcgtaea tgaccctgtc 

14631 ttgataaata atcattaatt cgctttcaag aggtttatca aaacctgaca atacgtcgtc aactgtaacg 

14701 tttaataaaa cctctctcat taatccatta cccaaacaat ctctataaca aaatacaagt atattaaaaa 

14771 catgttttcc aatatcaatg tcgatatcta acgtaaacaa ctcccttcca atttcaaaac caccatatcg 

14841 tttgtcaaac tcaatataca catcacccat actcattttc accatacacc ctccactaga cgaagtaaat 

14911 etctcaaate catcaccata acaacctcca ctcgctaaaa ggcaataaat caaattattt aatctaaasg 

14981 tagttttaat tttcattttt atatctcctt aatgtactct acgatatacg cgtattcctt agtgaacagg 

15051 ccacactcac aacatgaata tacaactcca gcgccaeaca aacccccaaa caccgagacc tgatgtggaa 

15121 aacgtcctct aatctcatcg caatataaca ataccgcctc gcacctacge teeatttaaa cacctcataa 

15191 aaaacagggg ataagtatcc cctatgaaat tgtattaaaa cgacacctga ccaaaattga tcgagcaacc 

15261 tttttgacct cttttgtcct catattcaca aattgtgaac cgaacttctc cagcattgat aatgtcaaca 

15331 acgtcctcat ctgccctcac tcctttaact aattctgtta agcggttcgg taagtctacg ctacagtcat 

15401 cagtgacgat aacacettgt tcaccgaatt ctgattcttc gtttgtgaat aatgctctaa cgacatactc 

15471 ttttttcaca ccgtattctt ctactaatte tgatagtttg ataaattctc CttcctCttc ctcaaactca 

15541 aatctcgcta atgtgttttg gcgtcttgat aaaatatcct ctacgcttgc cattttatct ctcctcttac 

15611 ttaaateatt tgctctctgc aaccgcgatt cgtagtaaac cattgtaata aacttgaatt gctttcgttg 

15681 tgcgcgtagt ggacaacagt ccacatgtgt ctggtaataa ttcttttgct tgtgcttcgg ttaaatgata 

15751 ctcgtgaagt ggcaaaaatt cctcaatgta ttcattatca tcaectaagt aatgaagtat ataacctttg 

15821 acacgcaagg caacaacgcc gtcaaccttc attactatat caccccttcc taaaaaacgc aaacgttata 

15891 cgttccataa aaccctttat gcatattcca ctgttctatt gggtcatcac cagcaacata agacaatatc 

15961 gatcctggtt cagtttcgtt gcttagttca tcatttaaga atcgaacaac agaactatta tagtttaata 

16031 atagttgttg gcaagccgac aataagttaa ttgcactgtc aaatgtataa gctggattcc attgaatcag 

16101 tttattgaat agetgcaaca tttcagtaca ggcctgtcct ttttcttctg gtgcattatc aacattaacc 

16171 attattatca cttcctaata aagttgaaat cacgcgtaaa acagaattat gatttaaatc ttcaatttca 

16241 tcaacgccaa catcataaaa tgaaattcca ttttctgttc catcaaacaa cgctatacat aaacttccat 

16311 tcttaaaacg aaaaacatgc ttcaactcaa tgttttttgt ttcatttccc atttetgtta cecettgttt 

16381 tgatcacata cttagtacag caaacgtcca aaagttttgt caacagtttt tcttaaaaaa gtttaaataa 

16451 tttcaaaact actatttaat agaagaaata agatcctaag ttcaaatcat aattttgaat aaaagtcaat 

16521 agatacacaa attttgtact tgatgaatat gtaataggtt agataagtcg gttaagttgt tgcacagtat 

16591 ttctaagctt agtaaagaaa tgataagtaa atttataagt cttgatttgt ataatcgttt atttcaaacc 

16661 ggtggggt 

Table 17 



Phage 44AHJD ORFs list 

nb   Name           Frame   Position         Size (a.a.)   Keywords 
1    44AHJDORF001   -1      10342..12627     761           DNA polymerase; 
2    44AHJDORF002   3       3789..5732       647           Teichoic acid; Staph; 
3    44AHJDORF003   2       6626..8389       587           Tail; 
4    44AHJDORF004   1       8764..10227      487           Serine protease motif; 
5    44AHJDORF005   -1      12643..13890     415 
6    44AHJDORF006   2       803..2029        408 
7    44AHJDORF007   1       2044..3027       327           Upper collar 
8    44AHJDORF008   2       3020..3775       251           Lower collar; 
9    44AHJDORF009   2       5744..6496       250           Amidase; Staph; 
10   44AHJDORF010   -2      13938..14420     160 
11   44AHJDORF012   3       8391..8813       140           Holin; 
12   44AHJDORF013   -2      14586..14996     136 
13   44AHJDORF113   1       199..600         133 
14   44AHJDORF011   -2      15225..15593     122 
15   44AHJDORF114   -2      15870..16172     100 
16   44AHJDORF014   3       6243..6521       92 
17   44AHJDORF015   1       15403..15645     80 
18   44AHJDORF016   -1      15616..15852     78 
19   44AHJDORF017   -2      10536..10757     73 
20   44AHJDORF018   -1      886..1098        70 
21   44AHJDORF019   -2      9630..9836       68 
22   44AHJDORF121   -1      16165..16362     65 
23   44AHJDORF020   2       13865..14053     62 
24   44AHJDORF123   2       614..796         60 
25   44AHJDORF021   -2      5634..5816       60 
26   44AHJDORF023   -2      6315..6494       59 
27   44AHJDORF024   1       14275..14451     58 
28   44AHJDORF025   -3      14999..15175     58 
29   44AHJDORF026   -3      14426..14593     55 
30   44AHJDORF027   1       12916..13080     54 
31   44AHJDORF029   -1      15019..15183     54 
32   44AHJDORF028   -3      9071..9235       54 
33   44AHJDORF030   3       14487..14648     53 
34   44AHJDORF031   2       11039..11191     50 
35   44AHJDORF135   3       693..842         49 
36   44AHJDORF033   -1      3646..3795       49 
37   44AHJDORF032   -2      9306..9455       49 
38   44AHJDORF034   -3      14000..14146     48 
39   44AHJDORF035   -3      13811..13957     48 
40   44AHJDORF036   -3      10019..10165     48 
41   44AHJDORF022   -3      8468..8611       47 
42   44AHJDORF037   1       14788..14931     47 
43   44AHJDORF038   -2      3528..3671       47 
44   44AHJDORF039   3       1743..1883       46 
45   44AHJDORF040   2       9740..9877       45 
46   44AHJDORF041   2       15836..15973     45 
47   44AHJDORF042   -1      5014..5151       45 
48   44AHJDORF043   -1      4402..4539       45 
49   44AHJDORF044   -2      12783..12917     44 
50   44AHJDORF149   -2      639..770         43 
51   44AHJDORF046   1       4891..5019       42 
52   44AHJDORF047   1       11911..12039     42 
53   44AHJDORF045   2       10655..10783     42 
54   44AHJDORF048   -3      15212..15340     42 
55   44AHJDORF049   3       5784..5909       41 
56   44AHJDORF050   3       13158..13283     41 
57   44AHJDORF051   -2      10944..11066     40 
58   44AHJDORF052   -3      14216..14338     40 
59   44AHJDORF053   3       3348..3467       39 
60   44AHJDORF054   3       7551..7670       39 
61   44AHJDORF055   3       15705..15821     38 
62   44AHJDORF056   1       5512..5625       37 
63   44AHJDORF057   2       10121..10231     36 
64   44AHJDORF058   3       10767..10877     36 
65   44AHJDORF164   -1      592..702         36 
66   44AHJDORF059   -2      8250..8360       36 
67   44AHJDORF060   -2      6147..6257       36 
68   44AHJDORF061   2       15551..15658     35 
69   44AHJDORF062   1       4285..4389       34 
70   44AHJDORF063   -3      9383..9487       34 
71   44AHJDORF065   -1      5029..5130       33 
72   44AHJDORF064   2       2609..2710       33 
73   44AHJDORF066   -2      10380..10481     33 

Table 18 

Predicted amino acid sequences 
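
In the listings below, each row of DNA is followed by its predicted translation, and ORFs assigned a negative frame in Table 17 are written from their higher coordinate downward, that is, as the reverse complement of the genome strand (44AHJDORF001 runs from 12627 to 10342). The sizes in Table 17 also appear to follow the relation size = (end - start + 1) / 3 - 1, the final codon being the stop. Purely as an illustration, a minimal Python sketch of these three operations under the standard genetic code might look as follows. The helper names `reverse_complement`, `translate` and `orf_size_aa` are hypothetical, and use of the standard code is an assumption rather than something the listing states.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order t, c, a, g.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a lowercase a/c/g/t string."""
    return dna.lower().translate(_COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate codon by codon; unreadable codons are reported as 'X'."""
    dna = dna.lower()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def orf_size_aa(start: int, end: int) -> int:
    """Residues encoded by start..end (inclusive), excluding the stop codon."""
    return (end - start + 1) // 3 - 1
```

For example, `translate("atggcatataatgaaaacgattttaaatattttgacgac")` gives `MAYNENDFKYFDD`, the opening residues of 44AHJDORF002, and `orf_size_aa(10342, 12627)` gives 761, the size listed for 44AHJDORF001.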

44AHJDORF001 

12627 atgggattactagaatgcatgcaatatcataaacatgaacgtcgaatgattttatactgggatatagaaacattagcgtacaat 

1 MGLLECMQYHKHERRMILYWDIETLAYN 

12543 aaagttaacggacgaaaaaaaccaaccaaatataaaaacgttacttattctgtagcaattggttggtttaatggttatgaaatt 

29 KVNGRKKPTKYKNVTYSVAIGWFNGYEI 

12459 gatgttgaagtatttccgagtttcgaatctttttatgacgcattttatacgtatgtgaaaagacgtgatacaatcacaaaatca 

57 DVEVFPSFESFYDAFYTYVKRRDTITKS 

12375 aaaacagatattatcatgattgcacacaaccgtaacaaatacgacaatcattttttactcaaagacaccatgcgttattttgat 

85 KTDIIMIAHNCNKYDNHFLLKDTMRYFD 

12291 aatattacacgcgaaaatatatatttaaaatctgcagaagaaaatgaacacacattaaaaatgaaagaggctactattttagcc 

113 NITRENIYLKSAEENEHTLKMKSATILA 

12207 aaaaatcaaaatgtaattttagaaaaacgtgttaaatcttcaatcaatttagatttaacaatgtttttaaatggttttaaattt 

141 KNQNVILEKRVKSSINLDLTMFLNGFKF 

12123 aatattattgataaccttatgaaaaccaatacatcaattgcaacattaggtaagaaatcacttgatggtggtcatttaacagaa 

169 NIIDNFMKTNTSIATLGKKLLDGGYLTE 

12039 tcacaacttaaaacagattttaattatacgatttttgataaagataatgatatgaatgatagtgaagcctatgactatgctgtg 

197 SQLKTDFNYTIFDKDNDMNDSEAYDYAV 

11955 aaacgttttgcaaaactcacacccgaacaacctacatacattcataatgacgtgactatattaggtatgtgccatattcattat 

225 KCFAKLTPEQLTYIHNDVIILGMCHIHY 

11871 agtgatatatttccaaattttgactataacaaattaacattttcattgaatattatggaatcttacttgaataatgaaatgaca 

253 SDIFPNFDYNKLTFSLNIMESYLNNEMT 

11787 cgttttcagttactcaaccaatatcaagacattaaaatatctcatacacattatcatttccatgatatgaattcttatgactat 

281 RFQLLNQYQDIKISYTHYHFHDMNPYDY 

11703 attaaatcattctaccgtggtggtttaaacatgtataacaccaaatacataaacaaactaattgatgagccttgtttttctatt 

309 IKSFYRGGLNMYNTKYINKLIDEPCFSI 

11619 gacatcaattcgagttatccttatgcgatgtatcatgaaaaaattccaacatggttatacttttacgaacactattcagaacca 

337 DINSSYPYVMYHEKIPTWLYFYEHYSEP 

11535 acgttaatccctacttttttagatgatgacaattatttttcattatataagattgataaagatgtatttaacgatgatttatta 

365 TLIPTFLDDDNYFSLYKIDKDVFNDDLL 

11451 attaaaattaaatcacgtgtattacgtcaaatgattgtaaaatactataataatgataatgattacgttaatatcaatacaaat 

393 IKIKSRVLRQMIVKYYNNDNDYVNINTN 

11367 acattaagaatgattcaagacattacgggtattgattgcatgcatatacgtgttaattcgtttgttatatatgaatgtgaatac 

421 TLRMIQDITGIDCMHIRVNSFVIYECEY 

11283 tttcatgcacgtgatattatttttcaaaactattttattaaaacacaaggtaagttaaaaaacaaaatcaatatgacatcacct 

449 FHARDIIFQNYFIKTQGKLKNKINMTSP 

11199 tacgactatcacattactgatgatatcaacgaacacccatactcaaatgaggaggttatgttatctaaagtcgttttaaatgga 

477 YDYHITDDINEHPYSNEEVMLSKVVLNG 

11115 ttatatggcatacctgcattacgttcacattttaacttattccgtttagatgataacaatgaactatacaatatcattaacggt 

505 LYGIPALRSHFNLFRLDDNNELYNIING 

11031 tacaaaaacactgaacgtaatatattattctctacatttgtcacatcacgttcattgtataacttattggttcctttccaatac 

533 YKNTERNILFSTFVTSRSLYNLLVPFQY 

10947 ttaacggaaagtgaaattgacgacaattttatttattgcgatactgatagtttgtatatgaaatccgttgttaaacccttattg 

561 LTESEIDDNFIYCDTDSLYMKSVVKPLL 

10863 aaccccagtttattcgacccgatagccttaggtaaatgggatattgaaaacgaacagatagataagatgtttgtactgaatcat 

589 NPSLFDPIALGKWDIENEQIDKMFVLNH 

10779 aagaaatacgcatatgaagtgaatggaaagattaaaattgcttctgctggtataccgaaaaacgcctttgatacaagcgtcgat 

617 KKYAYEVNGKIKIASAGIPKNAFDTSVD 

10695 tttgaaacctttgtacgtgaacaattctttgacggtgccattattgaaaacaataaaagtatctataatgagcaaggtacaata 

645 FETFVREQFFDGAIIENNKSIYNEQGTI 

10611 tcgatatatccgtctaaaactgaaattgtatgtggtaatgtatatgatgaatattttactgatgaacttaatatgaaacgtgaa 

673 SIYPSKTEIVCGNVYDEYFTDELNMKRE 
10527 tttatattaaaagacgctagagaaaaattcgaccatagtcaatttgatgatattctttatattgaaagtgacatcggttcattt 
701 FILKDARENFDHSQFDDILYIESDIGSF 
10443 tcacttaacgacttatttccagttgaacgttcagtacataacaaatctgatttgcatatattaaaacgtgaacatgatgaaata 
729 SLNDLFPVERSVHNKSDLHILKREHDEI 
10359 aaaaaaggcaactgttaa 10342 
757 KKGNC* 
44AHJDORF002 

3789 atggcatataatgaaaacgattttaaatattttgacgacattcgtccatttttagacgaaatttataaaacgagagaacgttat 
1 MAYNENDFKYFDDIRPFLDEIYKTRERY 
3873 acaccgttttacgatgatagagcagattataatactaattcaaaatcatattatgattatatttcaagattatcaaaactaatt 
29 TPFYDDRADYNTNSKSYYDYISRLSKLI 

3957 gaagtattagcacgtcgtatttgggactatgacaatgaattaaaaaaacgtttcaaaaattgggacgacttaatgaaagcattt 
57 EVLARRIWDYDNELKKRFKNWDDLMKAF 
4041 ccagagcaagcgaaagacttatttagaggttggttaaacgacggtacgattgacagtattattcatgacgagtttaaaaaatat 
85 PEQAKDLFRGWLNDGTIDSIIHDEFKKY 

4125 agcgcaggattaacatcggcatttgctttatttaaagttactgaaatgaaacaaatgaatgactttaaatcagaagttaaagac 
113 SAGLTSAFALFKVTEMKQMNDFKSEVKD 
4209 ctaattaaagatattgaccgcttcgttaatgggtttgaattaaatgagcttgaaccaaagtttgtgatgggctttggtggtatt 
141 LIKDIDRFVNGFELNELEPKFVMGFGGI 

4293 cgcaacgcagttaaccaatctattaatatcgataaagaaacaaatcacatgtactctacacaatccgattctcaaaaacctgaa 

169 RNAVNQSINIDKETNHMYSTQSDSQKPE 

4377 ggtttttggataaataaattaacacctagtggtgacttaatttcaagcatgcgtattgtacagggtggtcatggtacaacaatc 

197 GFWINKLTPSGDLISSMRIVQGGHGTTI 

4461 ggattagaacgtcaatccaatggtgaaatgaaaatctggttacatcacgacggtgttgcaaaactgttacaagtcgcatataaa 

225 GLERQSNGEMKIWLHHDGVAKLLQVAYK 

4545 gataattatgtattagatttagaagaggctaaaggtttaacagattatacaccacagtcacttttaaacaaacacacatttaca 

253 DNYVLDLEEAKGLTDYTPQSLLNKHTFT 

4629 ccgttaattgatgaagcaaatgacaaactcattttaagattcggtgacggaacaatacaggttcgttcaagagcagacgtaaaa 

281 PLIDEANDKLILRFGDGTIQVRSRADVK 

4713 aatcacattgataatgcagaaaaagaaacgacaattgataattcagaaaacaatgataatcgttggatgcaaggcattgctgtt 

309 NHIDNVEREMTIDNSENNDNRWMQGIAV 

4797 gatggtgatgatttatactggttaagtggtaacagttcagttaattcacatgttcaaatcggtaaatattcattaacaacaggt 

337 DGDDLYWLSGNSSVNSHVQIGKYSLTTG 

4881 caaaagatttatgattatccatttaagttatcatatcaagacggtattaatttcccacgtgataactttaaagagcctgagggt 

365 QKIYDYPFKLSYQDGINFPRDNFKEPEG 

4965 atttgcatttatacaaatccaaaaacaaaacgeaaatcgteattacetgctatgacaaaeggcggeggtggaaaacgttcccat 

393 ICIYTNPKTKRKSLLLAMTNGGGGKRFH 

5049 aatttatatggtttcttccaacttggtgagtatgaacactttgaagcattacgcgcaagaggttcacaaaactataaattaaca 
421 NLYGFFQLGEYEHFEALRARGSQNYKLT 
5133 aaagacgacggtcgtgcattatctattccagaccatatcgacgatttaaatgacttaacgcaagctggtttttattatattgac 

449 KDDGRALSIPDHIDDLNDLTQAGFYYID 

5217 gggggtactgcagaaaaacttaagaatacgccaatgaatggtagcaagcgtataattgacgctggttgtttcattaatgtatac 

477 GGTAEKLKNMPMNGSKRIIDAGCFINVY 

5301 cctacaacacaaacattaggtacggttcaagaattaacacgtttctcaacaggtcgtaaaatggttaaaatggtgcgtggtatg 

505 PTTCTLGTVQELTRFSTGRKMVKMVRGM 
5385 actttagacgtatttacgttaaaatgggattatggattatggacaacaatcaaaactgacgcaccatatcaagaatatttggaa 
533 TLDVFTLKWDYGLWTTIKTDAPYQEYLE 
5469 gcaagtcaataeaataactggattgcttatgtaacaacagctggtgagtattacatcacaggtaaccaaatggaattatttaga 
561 ASQYNNWIAYVTTAGEYYITGNQMELFR 
5553 gacgcgccagaagaaattaaaaaagtgggtgcatggttacgtgtgtcaagtggtaacgcagtcggtgaagtaagacaaacatta 
589 DAPEEIKKVGAWLRVSSGNAVGEVRQTL 
5637 gaggctaatatatcggaatataaagaattcttcagtaatgttaatgcggaaacaaaacatcgtgaatatggttgggtagcaaaa 
617 EANISEYKEFFSNVNAETKHREYGWVAK 
5721 catcaaaaatag 5732 

645 H Q K * 
44AHJDORF003 

6626 atgagaaagttaacgaattttaagtttttctataacacaccgtttacagactatcaaaacacgactcattttaatagtaataaa 

1 MRKLTNFKFFYNTPFTDYQNTIHFWSNK 

6710 gaacgtgacgattattttttaaatggtcgtcattttaaatcgttagactattcaaaacaaccgtataattttatacgtgataga 

29 ERDDYFLNGRHFKSLDYSKQPYNFIRDR 

6794 atggaaatcaatgttgatatgeagtggcatgacgcacaaggtattaaetacatgacgtttttatcagattttgaggatagaaga 

57 MEINVDMQWHDAQGINYMTFLSDFEDRR 

6878 tattacgcttttgtaaaccaaatcgaatacgtgaatgacgttgtggttaaaatatattttgtcattgataccattatgacgtat 

85 YYAFVNQIEYVMDVVVKIYFVIDTIMTY 

6962 acacaagggaatgtattagagcaacectcaaacgtcaataecgaacgtcaacatttatcaaaacgcacgtataactatatgtta 

113 TQGNVLEQLSNVNIERQHLSKRTYNYML 

7046 ccaatgttacgtaataatgatgatgcgttaaaagcatcaaacaaaaactatgteeacaaccaaatgcaacaaeacttggaaaat 

141 PMLRNNDDVLKVSNKNYVYNQMQQYLEN 

7130 ctagtattattccagtcaagcgctgatttatcaaagaaatttggtactaaaaaagagccaaacttagatacgtcaaaaggtacg 

169 LVLFQSSADLSKKFGTKKEPNLDTSKGT 

7214 atttatgacaatatcacatcaccagtcaacttatacgttatggaatatggtgactttattaactttatggataaaatgagtgcc 

197 IYDNITSPVNLYVMEYGDFINFMDKMSA 

7298 tatccatggatcacgcaaaactttcaaaaggttcaaatgttacctaaagactttattaatacaaaagacttagaggacgttaaa 

225 YPWITQNFQKVQMLPKDFINTKDLEDVK 

7382 accagtgaaaaaattacaggattaaaaacattaaaacagggtggtaaatcaaaagaatggagtctaaaagatttatcattaagt 

253 TSEKITGLKTLKQGGKSKEWSLKDLSLS 

7466 ttctcaaatettcaagagatgatgttatctaaaaaagatgaatttaaacatatgatacgtaatgagtatatgacaattgaattt 

281 FSNLQEMMLSKKDEFKHMIRNEYMTIEF 

7550 tatgactggaatggaaatacgatgttactcgacgctggtaagatttcacaaaaaactggtgttaagttacgtacaaaatcaatt 

309 YDWNGNTMLLDAGKISQKTGVKLRTKSI 

7634 attggttatcataatgaagttcgagtatatccagtagattataacagtgctgaaaacgacagaccaatactcgctaaaaataaa 

337 IGYHNEVRVYPVDYNSAENDRPILAKHK 

7718 gaaatattgattgatacgggtccattcttaaatacaaatataacatttaatagttttgcacaagtaccaatattaatcaataat 

365 EILIDTGSFLNTNITFNSFAQVPILINN 

7802 ggtatcttaggacaatcacaacaagccaaccgacaaaaaaatgcagaaagtcaattaattacaaatcgtattgataatgtatta 

393 GILGQSQQANRQKNAESQLITNRIDNVL 

7886 aatggtagcgacccgaaatcacgcttttatgacgctgtgagtgtagcaagtaatttaagtccaactgctttatttggtaagttt 

421 NGSDPKSRFYDAVSVASNLSPTALFGKF 

7970 aatgaagaatataatttctacaaacaacaacaagctgaatataaagatttagccttacaaccaccttctgtaactgaatcagaa 

449 NEEYNFYKQQQAEYKDLALQPPSVTESE 

8054 atgggcaacgcattccaaattgcgaatagcattaacggtttaacgatgaaaattagtgtaccgtcacctaaagaaattacattt 

477 MGNAFQIANSINGLTMKISVPSPKEITF 

8138 ttacaaaaatattacatgttgcttggttttgaagtgaatgactataattcatttattgaaccaattaacagtatgactgtttgc 
505 LQKYYMLFGFEVNDYNSFIEPINSMTVC 

8222 aattatttaaaatgtacaggtacgtatactatacgtgacatcgaccccatgctaatggaacaattaaaagcaattttagaatct 

533 NYLKCTGTYTIRDIDPMLMEQLKAILES 

8306 ggtgtaagattttggcataatgacggttcaggtaatccaatgttacaaaatccateaaataacaaatttagagagggggtataa 8389 

561 QVRFWHNDGSGNPMLQK PIiMMKFRBGV* 
44AHJDORF004 

8764 atgatactgaaaagagtgataacaatgaacgateaagagaagatagataaatttacgcattcctatattaatgatgattttggt 

1 MILKRVITMNDQEKIDKFTHSYINDDFG 

8848 ttaacgatagaccagttagtccctaaagtaaaaggatatgggcgctttaatgtatggcttggtggtaatgaaagtaaaatcaga 

29 LTIDQLVPKVKGYGRFNVWLGGNESKIR 

8932 caagtattaaaagcagtaaaagagataggtgtetcacctactetettegccgtatacgaaaaaaatgagggttttagttctgga 

57 QVLKAVKEIGVSPTLFAVYEKNEGFSSG 

9016 cttggttggttaaaccatacgtccgcacgtggtgattatttaacagatgctaaattcatagcaagaaagttagtatcacaatca 

85 LGWLNHTSARGDYLTDAKFIARKLVSQS 

9100 aaacaagctggacaaccgtcttggtatgacgcaggtaacatcgtccactttgtaccacaagacgtacaaagaaaaggtaatgca 

113 KQAGQPSWYDAGNIVHFVPQDVQRKGNA 

9184 gattttgcaaaaaatatgaaagcaggtacaattggacgtgcatatattccattaacagcagctgctacttgggcggcatattat 

141 DFAKNMKAGTIGRAYIPLTAAATWAAYY 

9268 cctttaggtttgaaagcatcatataacaaagtacaaaactatggtaatccatttttagacggtgcgaacaccattctagcttgg 

169 PLGLKASYNKVQNYGNPFLDGANTILAW 

9352 ggtggtaaattagacggtaaaggtggatcacctagtgattcgtctgacagtggtagtagtggtgacagtggtagttcactactc 

197 GGKLDGKGGSPSDSSDSGSSGDSGSSLL 

9436 gctttagcaaaacaagccatgcaagaattattaaaaaaaatacaagacgcattacaatgggacgttcatagtattggtagtgat 

225 ALAKQAMQELLKKIQDALQWDVHSIGSD 

9520 aaattttttagtaatgattattttaeatcagaaaaaacatttaacaacacatatcatattaaaatgacgattggtttacttgat 

253 KFFSNDYFTLEKTFNNTYHIKMTIGLLD 

9604 tcattaaaaaaactgattgatagcgttcaagtagatagtgggagtagtagttctaatcctactgatgatgacggagaccataaa 

281 SLKKLIDSVQVDSGSSSSNPTDDDGDHK 

9688 ccaattagtggtaaatcagtcaagccaaatggaaaaagtggtcgtgtgattggtggtaactggacaeatgcacagttaccagaa 

309 PISGKSVKPNGKSGRVIGGNWTYAQLPE 

9772 aaatataaaaaagcaattggtgtacctttattcaaaaaagaatacttatacaaaccaggtaacatatttcctcaaacgggtaat 

337 KYKKAIGVPLFKKEYLYKPGNIFPQTGN 

9856 gcaggacaatgtacagaattaacatgggcgtatatgtcacaactacatggtaaaagacaacctaccgacgacggtcaaataaca 

365 AGQCTELTWAYMSQLHGKRQPTDDGQIT 

9940 aacggtcagcgtgtatggtacgtctataaaaagttaggtgcaaaaacaacacataatccaaeagtaggttatggtttctctagt 

393 NGQRVWYVYKKLGAKTTHNPTVGYGFSS 

10024 aaaecaccatacttacaagcaactgcatatggtattggtcacacaggtgttgttgtagcagtttttgaagatggttcgttttta 

421 KPPYLQATAYGIGHTGVVVAVFEDGSFL 

10108 gttgcaaactataatgtaccaccatatgttgcaccatcacgtgtggtattgtacacactcattaatggcgtaccaaataatgct 

449 VANYNVPPYVAPSRVVLYTLIMGVPNNA 

10192 ggtgataatattgtattctttagtggtattgcttaa 10227 

477 GDNIVFFSCIA* 
44AHJDORF005 

13890 atggtaaaacaaaatcgtttagacatggtaagagattatcaaaatgctgtcaatcatgtcagaaaaaaaatcccagataagtat 

1 MVKQNRLDMVRDYQNAVNHVRKKIPDKY 

13806 aatcaaatagaattagttgatgaacttatgaatgatgatatagattattatatatctatttcaaaccgttctgatggaaaatcg 

29 NQIELVDELMNDDIDYYISISNRSDGKS 

13722 ttcaactatgtttcatettttatttatctagctattaaacttgatataaaatttactttattatcacgtcattatacattacgt 

57 FNYVSFFIYLAIKLDIKFTLLSRHYTLR 

13638 gacgcttaccgtgattttattgaagaaatcatagatgaaaatccactatttaaatcaaaacgtgtcacgttcagaagtgctagg 

85 DAYRDFIEEIIDENPLFKSKRVTFRSAR 

13554 gactatttagctattatctatcaagataaagaaattggtgtgattacagatttgaatagtgccactgatttaaaatatcattct 

113 DYLAIIYQDKEIGVITDLNSATDLKYHS 

13470 aactttttaaaacactatcctattattatatatgatgagtttttagcacttgaagatgattatttaattgatgagtgggataag 

141 SFLKHYPIIIYDEFLALEDDYLIDEWDK 

13386 ttaaaaacaatatatgaatcaatcgaccgtaaccatggtaacgttgattatattggattccctaaaatgtttttactaggtaat 

169 LKTIYESIDRNHGNVDYIGFPKMFLLGN 

13302 gcagteaacttttcaagtcctatattatccaatttaaatatatacaatttattacaaaagcataaaatgaatacatcaagactt 

197 AVNFSSPILSNLNIYNLLQKHKMSTSRL 

13218 tacaaaaacatttttttagaaatgcgacgaaacgattacgtgaatgaaaaacgtaacacacgtgcgtttaattcaaatgacgac 

225 YKNIFLEMRRNDYVNEKRNTRAFNSNDD 

13134 gctatgacaactggagaatttgaatttaacgaatataatttggcggatgataatttaagaaatcacatcaatcaaaacggtgat 

253 AMTTGEFEFNEYNLADDNLRNHINQNGD 

13050 ttcttctatatcaaaactgatgataaatatattaaagtcatgtataatgtaactacttttatgacaaatattatcgttgtacca 

281 FFYIKTDDKYIKVMYNVTTFMTNIIVVP 

12966 tatacaaaacaatatgaattttgtactaaaattagggatatagacaatcatgttacctatttacgcgatgatatgttttataaa 

309 YTKQYEFCTKIRDIDMHVTYLRDDMFYK 

12882 gaaaacatggaacgttattactacaatccaagcaatttacattttgacaatgcttactctaaaaattacgtggttgataatgat 

337 ENMERYYYNPSNLHFDNAYSKNYVVDND 

12798 agatatttatatttagatatgaataaaattataaaatttcatataaaaaatgaaatgaagaaaaatatgagtgagtttgaaaga 

365 RYLYLDMNKIIKFHIKNEMKKNMSEFER 

12714 aaagaaaaaatatacgaagataactatacagagaatacgaaaaagtatctaatgaaacaataeggcctataa 12643 

393 KEKIYEDNYIENTKKYLMKQYGL* 
44AHJDORF006 
803 atggcacaacaatctacaaaaaacgaaactgcacttttagcagcaaagccagctaaaccagcgttacaagattttaatcatgat 

1 MAQQSTKNETALLVAKSAKSALQDFNHD 

887 tattcaaaatcttggacacttggcgacaaatgggataattcaaatacaatgttcgaaacatttgtaaataaatatttattccct 

29 YSKSWTFGDKWDNSNTMFETFVNKYLFP 

971 aagattaatgagactttattaatcgacattgcattaggtaatcgttttaattggttagctaaagagcaagattttattggacaa 

57 KINETLLIDIALGNRFNWLAKEQDFIGQ 

1055 tatagtgaagaatacgtgattatggacacagtaccaactaacatggacttatctaaaaatgaggaattaatgttgaaacgtaat 

85 YSEEYVIMDTVPINMDLSKNEELMLKRN 

1139 tatccacgtatggcaactaagttatatggcaacggaattgtgaagaaacaaaaattcacattaaacaacaatgatacacgtttc 

113 YPRMATKLYGNGIVKKQKFTLNNNDTRF 

1223 aatttccaaacattagcagacgcaactaactacgctttaggtgtatacaaaaagaaaatttctgatattaatgtattagaagaa 

141 NFQTLADATNYALGVYKKKISDINVLEE 

1307 aaagaaatgcgtgcaatgttagttgaetactcattgaaccaattatccgaaacaaatgtacgtaaagcaaeatcaaaagaagat 

169 KEMRAMLVDYSLNQLSETNVRKATSKED 

1391 ctagcaagcaaagtttttgaagcaaccctaaacccacaaaacaacagtgccaaacacaatgaagcacaccgtgcatcaggcggc 

197 LASKVFEAILNLQNNSAKYNEVHRASGG 

1475 gcaattggacaatatacaactgtatcaaaattaaaagatattgtgattetaacaacagattcattaaaatcttatcttttagat 

225 AIGQYTTVSKLKDIVILTTDSLKSYLLD 

1559 actaagactgcaaacacattecagattgcaggcactgatttcacagatcacgttattagttttgacgacttaggtggcgtgttt 

253 TKIANTFQIAGIDFTDHVISFDDLGGVF 

1643 aaagtaacaaaagaatttaagttacaaaaccaagattcaattgactttttacgtgcgcatggagattatcaatcacaattagga 

281 KVTKEFKLQNQDSIDFLRAYGDYQSQLG 

1727 gatacaattccagttggtgctgcatttacttatgatgtatctaaacttaaagagtttactggcaacgttgaagaaattaaacca 

309 DTIPVGAVFTYDVSKLKEFTGNVEEIKP 

1811 aaaccagatttatatgcgtttattttggatattaattcaattaaatacaaacgttacacaaaaggtatgttaaaaccaccattc 

337 KSDLYAFILDINSIKYKRYTKGMLKPPF 

1895 cataaccctgaattcgatgaagctacacaccggattcaetactactcatttaaagccattagcccatcccttaataaaacttta 

365 HNPEFDEVTHWIHYYSFKAISPFFNKIL 

1979 attactgaccaagacgcaaatccaaaaccagaggaagaattacaagaataa 2029 

393 ITDQDVNPKPEEELQE* 
44AHJDORF007 

2044 atgaacaacgataaaagaggtttaaacgttgagttatcaaaggaaatcagcaaaagagttgttgaacatcgcaacagatttaaa 

1 MNNDKRGLHVELSKEISKRVVEHRNRFK 

2128 cgcttaatgtttaatcgttatttggaatttctaccgctactaatcaactataccaatcgtgatacggttggtatagactttatt 

29 RLMFNRYLEFLPLLIKYTNRDTVGIDFI 

2212 cagttagaatcagctttaagacaaaacattaatgtagttgttggtgaagctagaaataagcaaattatgattcttggttatgta 

57 QLESALRQNINVVVGEARNKQIMILGYV 

2296 aataacacttactttaatcaagcaccaaaccttccatcaaacttcaatttccaattccaaaaacgactaactaaagaagatata 

85 NNTYFNQAPNFSSNFNFQFQKRLTKEDI 

2380 tattttattgtacctgactatttaatacctgatgattgtctacaaattcataagctatatgataactgtatgagtggtaacttt 

113 YFIVPDYLIPDDCLQIHKLYDNCMSGNF 

2464 gttgtcatgcaaaataaaccaattcaatataatagtgatatagaaattatagaacattatactgatgaattagcagaagttgct 

141 VVMQNKPIQYNSDIEIIEHYTDELASVA 

2548 ttatctcgcttttctttaatcatgcaagcaaaatttagcaagatattcaaaccagaaattaatgacgagtcaatcaatcaactt 

169 LSRFSLIMQAKFSKIFKSEINDESINQL 

2632 gtgtccgaaacatataacggcgcaccacctgccaaaacgtcacctatgtttaatgcagacgacgatatcattgattcaacaagt 

197 VSEIYNGAPFVKMSPMFNADDDIIDLTS 

2716 aatagcgtaatcccagcattaactgaaatgaaacgggaatatcaaaacaaaattagtgaattaagtaactatttaggcattaat 

225 NSVIPALTEMKREYQNKISELSNYLGIN 

2800 ccattagccgtcgacaaagaaagcggcgtttcagacgaagaggcaaaaagtaatcgtggatttaccacatcaaacagtaacacc 

253 SLAVDKESGVSDEEAKSNRGFTTSNSNI 

2884 tatttaaaaggtcgtgaaccaattacgtttttatcaaagcgttatggtttagatattaaaccgtactacgatgacgaaacaacg 

281 YLKGREPITFLSKRYGLDIKPYYDDETT 

2968 tctaaaatatcaatggtagacacactttttaaagatgaaagcagtgatataaatggctag 3027 

309 SKISMVDTLFKDESSDING* 
44AHJDORF008 

3020 atggctagatacacaatgactttatacgatttcattaaatcagaattgattaaaaaaggtttcaatgaatttgtaaatgataat 

1 MARYTMTLYDFIKSELIKKGFNEFVNDN 

3104 aaattaacgttttatgatgatgaatttcaactcatgcaaaaaatgctgaagttcgacaaagacgttttagctatcgttaatgaa 

29 KLTFYDDEFQPMQKMLKFDKDVLAIVNE 

3188 aaagcactcaaaggtttctcattgaaagatgaattatcagatttactttttaaaaaatcatttacgattcattttttagataga 

57 KVFKGFSLKDELSDLLFKKSFTIHFLDR 

3272 gaaatcaacagacaaacagttgaagcatttggcatgcaagtgattactgtatgtattacacatgaggattatttaaatgtggtc 

85 EINRQTVEAFGMQVITVCITHEDYLNVV 

3356 cattcatcaagtgaagtcgaaaaatacttacaatcacaaggcttcacagaacacaatgaagacacaacaagtaacactgatgaa 

113 YSSSEVEKYLQSQGFTEHNEDTTSNTDE 

3440 acatcgaatcaaaatgctacatctttagacaattcaactggcatgactgcaaacagaaacgcttatgtgtcattaccacaaagt 

141 TSNQNATSLDNSTGMTANRNAYVSLPQS 

3524 gaggttaacatcgatgttgataatacaacgtcacgattcgccgataataatacgattgataacggtaaaactgtgaataaaccg 

169 EVNIDVDNTTLRFADNNTIDNGKTVNKP 

3608 agtaacgaaagtaatcaaaacgcaaaacgtaatcaaaatcaaaaaggtaatgcaaaaggtacacaattcactaagcagtatcta 

197 SNESNQNAKRNQNQKGNAKGTQFTRQYL 

3692 attgacaacattgataaagcgtacgatctaagaaagaaaattttaaatgaatttgataaaaaatgttttttacaaatctggcag 3775 

225 IDNIDKAYDLRKKILNEFDKKCFLQIW- 
44AHJDORF009 
5744 atgaaatcacaacaacaagcaaaagaatggatatataagcatgagggggcaggtgttgactctgatggtgcatatggatttcaa 

1 MKSQQQAKEWIYKHEGAGVDFDGAYGFQ 

5828 tgtatggacttatcagttgcttatgtgtattacattactgacggtaaagttcgcatgtggggtaatgctaaagacgcgataaat 

29 CMDLSVAYVYYITDGKVRMWGNAKDAIN 

5912 aatgactttaaaggtttagcgacggtgtataaaaatacaccgagctttaaaccteaattaggggacgttgctgtatatacaaat 

57 NDFKGLATVYKNTPSFKPQLGDVAVYTN 

5996 ggacaacatggacatactcaatgtgtgttaagtggaaatcttgattattatacatgcttagaacaaaactggttaggcggcggt 

85 GQYGHIQCVLSGNLDYYTCLEQNWLGGG 

6080 tttgacggttgggaaaaagcaaccattagaacacattattatgacggtgtaactcactttattagacctaaattttcaggtagt 

113 FDGWEKATIRTHYYDGVTHFIRPKFSGS 

6164 aatagcaaagcattagaaacatcaaaagtaaatacatttggaaaatggaaacgaaaccaatacggcacatattatagaaatgaa 

141 NSKALETSKVNTFGKWKRNQYGTYYRNE 

6248 aatggtacatttacatgtggttttttaccaatatttgcacgtgtcggtagtccaaaattatcagaacctaatggctattggttc 

169 NGTFTCGFLPIFARVGSPKLSEPNGYWF 

6332 caaccaaacggttatacaccatataacgaagtttgtttatcagatggttacgtatggattggttataactggcaaggcacacgt 

197 QPNGYTPYNEVCLSDGYVWIGYNWQGTR 

6416 tattatttaccagtgcgccaatggaatggaaaaacaggtaatagttacagtgttggtattccttggggggtgttctcataa 6496 

225 YYLPVRQWNGKTGNSYSVGIPWGVFS* 
44AHJDORF010 

14420 ttggttagacatacgtctgaaatggatagatggaaaaaagaaagagaagctagaaaagagcaagaaaaagatttatttttaaat 

1 LVRHTSEMDRWKKEREARKEQEKDLFLN 

14336 gattttagtaatgttaattttaaatttgatgataaagatttacaagaggcgtacattgacacatggaaacattttgcacatctg 

29 DFSNVNFKFDDKDLQEAYIDTWKHFAHL 

14252 ccctattttcctaaagaaagaaacgtatcatatgtaaatgctgtatcattggtaagaggttcaagacataaaaaattaaattat 

57 PYFPKERNVSYVNAVSLVRGSRHKKLNY 

14168 attcttgaaatatataaccgtaatgatgattctaataataaaaacgctaaaaagcataaatacgctttatataatttacaagct 

85 ILEIYNRNDDSNNKNAKKHKYALYNLQA 

14084 aaaaataataattcttcaatgtataaatatattaaagaaatcgatactttatataaagaaattggtaaatcagatagaccagtg 

113 KNNNSSMYKYIKEIDTLYKEIGKSDRPV 

14000 acaaatattgatgatgaagatgtgaggtataactttttatattatgcaacatttgacgaataa 13938 

141 TNIDDEDVRYNFLYYATFDE* 
44AHJDORF011 

15593 atgacaaacgtaaaagatattttatcaagacaccaaaacacattagcgagatttgaatttgaggaaaaagaaagagaatttatc 

1 MTNVKDILSRHQNTLARFEFEEKEREFI 

15509 aaactatcagaattagtagaaaaatacggtatgaaaaaagagtatatcgttagagcattattcacaaacaaagaatcaaaattc 

29 KLSELVEKYGMKKEYIVRALFTNKESKF 

15425 ggtgaacaaggtgttatcgtcactgatgactataacgtaaacttaccgaaccacttaacagaattaattaaagaaatgagagca 

57 GEQGVIVTDDYNVNLPNHLTELIKEMRA 

15341 gatgaggacgttgttgacattateaatgctggagaagttcaattcacaatttatgaatatgaaaacaaaaaaggtcaaaaaggt 

85 DEDVVDIINAGEVQFTIYEYENKKGQKG 

15257 tactcaatcaattttggtcaagtatcattttaa 15225 

113 YSINFGQVSF* 
44AHJDORF012 

8391 atgaacgaagtaaaattcagatttacagactcagaagcgtttcacatgtttatatacgctggggatttaaaattactctacttt 

1 MNEVKFRFTDSEAFHMFIYAGDLKLLYF 

8475 ttatttgtattaatgttcgttgatattattacaggtatttcaaaagcaattaaaaataataacttatggtcaaaaaaatcaatg 

29 LFVLMFVDIITGISKAIKNNNLWSKKSM 

8559 agaggattttctaaaaaattattgatattctgtattatcattttagcaaacatcattgaccagattttacaattaaaaggtggt 

57 RGFSKKLLIFCIIILANIIDQILQLKGG 

8643 ctactcatgattacaatattttattatattgcaaatgagggactttctattgtagaaaattgtgcagaaatggacgtattagta 

85 LLMITIFYYIANEGLSIVENCAEMDVLV 

8727 ccagaacaaattaaagataaattaagagtcattaaaaatgatactgaaaagagtgataacaatgaacgatcaagagaagataga 

113 PEQIKDKLRVIKNDTEKSDNNERSREDR 

8811 taa 8813 
141 * 

44AHJDORF013 

14996 atgaaaattaaaactacttttagattaaataatttaatttattaccttttaacaaatagagattattataatgataaatttgaa 

1 MKIKTTFRLNNLIYYLLTNRDYYNDKFE 

14912 aaatttacttcatctaataaaaaatgtatagcaaaaataaatatgggtgatgtgtatattgagtttgacaaacaatatgatgat 

29 KFTSSNKKCIVKINMGDVYIEFDKQYDD 

14828 tttgaaattgaaaaagagttatttacgttagatatcgacattgatattaaaaaacatgtttttaatatacttgtattttattat 

57 FEIEKELFTLDIDIDIKKHVFNILVFYY 

14744 agaaattatttaagtaatgaattaataagagaaattttattaaacgttacaattgacgacgtattatcaaattttgataaacct 

85 RNYLSNELIREILLNVTIDDVLSNFDKP 

14660 cttgaaagcgaattaatgattatttatcaaaacaaagtcatatacgataatgggaaagtgattgaccatgaataa 14586 

113 LESELMIIYQNKVIYDNGKVIDHE* 
44AHJDORF113 

199 atgacagaatttgatgaaatcgtaaaaccagacgacaaagaagaaacttcagaatcaactgaagaaaatttagaatcaactgaa 

1 MTEFDEIVKPDDKEETSESTEENLESTE 

283 gaaacttcagaatcaactgaagaatcaactgaagaatcaactgaagaatcaactgaagataaaacagtagaaacaatcgaagaa 

29 ETSESTEESTEESTEESTEDKTVETIEE 

367 gaaaatgaaaacaaattagaacctactacaacagatgaagatagttcgaaattcgaccctgttgtattagaacaacgtactgct 

57 ENENKLEPTTTDEDSSKFDPVVLEQRIA 

451 tcaccagaacaacaagtgactacttttttatcttcacaaatgcaacaaccacaacaagtacaacaaacacaatcagatgtaaca 

85 SLEQQVTTFLSSQMQQPQQVQQTQSDVT 

535 gaatcaaacaaagaagataacgactattcagatgaagaactagttgataagttagatttagattag 600 
113 ESNKEDNDYSDEELVDKLDLD* 
44AHJDORF114 

16172 atggttaatgttgataatgcaccagaagaaaaaggacaagcctatactgaaacgttgcaactactcaacaaactgattcaatgg 

1 MVNVDNAPEEKGQAYTEMLQLFNKLIQW 

16088 aacccagctcacacatccgacaatgcaattaacteattatcggcttgccaacaactactatcaaaccataacagttctgttgtt 

29 NPAYTFDNAINLLSACQQLLLNYNSSVV 

16004 caattcttaaacgatgaaccaaacaacgaaactaaaccagaaccaacatcgccttacatcgctggcgatgacccaatagaacaa 

57 QFLNDELNNETKPESILSYIAGDDPIEQ 

15920 tggaacatgcataaaggattttatgaaacgcataacgtttacgttttttag 15870 

85 WNMHKGFYETYNVYVF* 
44AHJDORF014 

6243 atgaaaatggcacatttacatgcggtttcttaccaacacctgcacgcgccggtagtccaaaatcatcagaacctaatggctatt 

1 MKMVHLHVVFYQYLHVSVVQNYQNLHAI 

6327 ggtcccaaccaaacggttatacaccatataacgaagtttgtttatcagatggttacgcatggactggttataactggcaaggca 

29 GSMCTVIHHITKFVYQMVTYGLVITGKA 

6411 cacgtcactatttaccagtgcgccaatggaatggaaaaacaggtaatagttacagtgttggtattccttggggggtgttctcat 

57 HVIIYQCAMGMEKQVIVTVLVFLGGCSH 

6495 aatgggtattfctagcctttttctttga 6521 

85 NGYPSLFL* 
44AHJDORF015 

15403 gtgacgacaacaccttgttcaccgaatttcgattctttgtttgtgaataacgctctaacgatatacccttttttcataccgtat 

1 VTITPCSPNFDSLFVHNALTIYSPFIPY 

15487 ttttccactaattctgatagtttgataaattctctctctttttcctcaaattcaaatctcgctaatgtgttctggtgtcctgac 

29 FSTHSDSLINSLSFSSNSMLAMVPWCLD 

15571 aaaatatcttttacgtttgtcattttattccccctcttatttaaattatttgctttccgcaattgcgacttgtag 15645 

57 KISFTFVILFLLLFKLFAFCNCDL* 
44AHJDORF016 

15852 atgaaagttgacgacattgttaccttacgtgtcaaaggtcatatacttcattacttagatgatgataatgaatacactgaggaa 

1 MKVDDIVTLRVKGYILHYLDDDNEYIBE 

15768 tttttaccacttcacgagtatcatttaaccaaaacacaagcaaaagaattattaccagacacatgtaaactattgtccactaca 

29 FLPLHEYHLTKTQAKELLPDTCKLLSTT 

15684 cgcacaacgaaaacaattcaagtttattacaatgatttactacaaatcgcaattgcagaaagcaaataa 15616 

57 RTTKTIQVYYNDLLQIAIAESK* 
44AHJDORF017 

10757 acggaaagactaaaattgcttctgctggtataccgaaaaacgcctttgatacaagcgtcgattttgaaacctttgtacgtgaac 

1 MERLKLLLLVYRKTPLIQASILKPLYVH 

10673 aattctttgacggtgccattattgaaaacaataaaagtatctataatgagcaaggtacaatatcgacatatcegtctaaaactg 

29 HSLTVPLLKTIKVSIMSKVQYRYIRLKL 

10589 aaattgtatgtggtaatgtatatgatgaatattttactgatgaacttaatatga 10536 

57 KLYVVMYMKNILLMNLI* 
44AHJDORF018 

1098 atgttaattggtactgtgtccataaccacgtattcttcactatattgtccaacaaaatcttgctctttagctaaccaattaaaa 

1 MLIGTVSIITYSSLYCPIKSCSLANQLK 

1014 cgattacctaatgcaatatcgactaataaagtctcattaatcttagggaataaatatctatttacaaatgtttcgaacattgta 

29 RLPNAISINKVSLILGNKYLFTNVSNIV 

930 ttegaattaecceatttgtcgccaaatgtccaagaeettgaaeaa 886 

57 PELSHLSPMVQDFB- 
44AHJDORF019 

9836 atgccacctggttcgtataagtattctcctttgaacaaaggtacaccaattgctttcttatattttcctggtaactgtgcatat 

1 MLPGLYKYSFLNKGTPIAFLYFSGNCAY 

9752 gtccagttaccaccaatcacacgaccaccttttccatttggcttgactgatttaccactaattggtttatggtctccgtcatca 

29 VQLPPITRPLFPFGLTDLPLIGLMSPSS 

9668 tcagtaggattagaactactactcccactatctacttga 9630 

57 SVGLSLLLPLST* 

44AHJDORF121 

16362 atggaaaatgaaacaaaaaacattgagttgaagcatgtttttegttttaagaatggaagtttatgcatagcgtcacttgataga 

1 MSHSTKNIELKHVFRFKNGSLCIALFDR 

16278 acagaaaatgaaatttcattttatgatgttgacaccgacgaaattgaagacttaaatcacaattctgtcttacgcgtaatttca 

29 TENEISFYDVDIDEIEDLNHNSVLRVIS 

16194 actttattaggaagtgataataatggttaa 16165 

57 TLLGSDNNG* 
44AHJDORF020 

13865 atgcctaaacgattttgtcttaccatgtttttgctccttgtaacagttcatgatgtcgtctacagtgctaaatttatccgtcaa 

1 MSKRFCFTMFLLLVIVYDVVYSVKFIRQ 

13949 atgttgcataatataaaaagttatacctcacatcttcatcatcaatacttgtcactggtctatctgatttaccaatttctttat 

29 MLHNIKSYTSHLHKQYLSLVYLI YQFLY 

14033 ataaagtatcgatttctttaa 14053 

57 IKYRFL* 

44AHJDORF123 

614 atgtacgagggaaacaacatgcgttctatgatgggcacatcacatgaagattcaagaccaaacaaacgaacagaactaaatgaa 

1 MYEGNNMRSMMGTSYEDSRLNKRTELNE 

698 aacacgteaattgatacaaacaaaagtgaagatagttatggtgtacaaattcactcactttcaaaacaatcacttacaggtgac 

29 NHSIDTNKSEDSYGVQIHSLSKQSFTGD 

782 gttgaggaggaataa 796 

57 V E E E * 
44AHJDORF021 

5816 atgcaccatcaaagtcaacacctgcccccccacgcctatacatccattcctttgcctgttgctgtgacctcacttatatcactc 
1 MHHQSQHLPPHAYISILLLVVVISFISL 
5732 ccattctcgacgttttgctacccaaccatattcacgatgctttgtttccgcattaacatcaccgaagaattctccatattccga 
29 LFLMFCYPTIFTMFCFRINITEEFFIPR 

5648 cacattagcctctaa 5634 
57 Y I S L * 

44AHJDORF022 

8611 atgtttgctaaaatgataatacagaataecaataattttttagaaaatcctctcattgatttttttgaccataagttattattt 
1 MFAKMIIQNINNFLENPLIDFFDHKLLF 

8527 ttaattgcttttgaaacacctgtaacaatatcaacgaacattaatacaaataaaaagtag 8468 
29 LIAFEIPVIISTNINTNKK* 
44AHJDORF023 

6494 atgagaacaccccccaaggaataccaacactgtaaccattacctgtttttccattccatcggcgcactggtaaataataacgtg 
1 MRTPPKEYQHCNYYLFFHSIGALVNNNV 

6410 tgcctcgccagctataaccaacccacacgtaaccatctgacaaacaaacttcgttatatggcgtataaccgtttggttggaacc 
29 CLASYNQSIRNHLINKLRYMVYNRLVGT 
6326 aatagccattag 6315 
57 N S H * 

44AHJDORF024 

14275 gtgccaatgcacgcctcttgtaaatctttatcatcaaatttaaaattaacattactaaaatcatttaaaaataaatctttttct 
1 VSMYASCKSLSSNLKLTLLKSPKNKSFS 
14359 cgctcttttctagctcctccttcttttttccatctatccatttcagacgtatgtctaaccaatgttaccaacctccatacaaag 
29 CSFLASLSFFHLSISDVCLTNVIKLHIK 
14443 cataaataa 14451 
57 H K * 

44AHJDORF025 

15175 atggaacgtaaatacaaaacggtattattatattgcgatgagattaaaggacattttccacatcaaatctcaatgtttgaagat 
1 MERKYKTVLLYCDEIKGHFPHQISMFED 
15091 ttatatgacgctaaagttgtatattcatattatgaatataacctgttcactaaaaaatacgcgtataccatagaatacattaag 
29 LYDAKVVYSYYEYNLFTKKYAYIIEYIK 
15007 gagatataa 14999 
57 EI* 
44AHJDORF026 

14593 atgaacaacctattaaacatagccattgttttccttctagcatttttaactacacttatcatactcatgacactgcatatacgc 
1 MNNLLHIAIVFLLAFLITLIILMTLHIR 
14509 gtgtcatttggtgttttattcactacattgattatattctatattatctttttaatggttattcatgcttcatatggaggttga 14426 
29 VSFGVLFTTLIIFYIIFLMVIYALYGG* 

44AHJDORF027 

12916 atgattgtctatatceetaattttagtacaaaattcatattgttttgtatatggtacaacgataatatttgtcataaaagtagt 
1 MIVYIPSFSTKFILFCIWYNDHICHKSS 
13000 tacattacacatgaccttaatatatttatcatcagttttgatatagaagaaatcaccgtcttgattgatgtgatttcttaa 13080 
29 YIIHDPNIFIISFDIEEITVLIDVIS* 
44AHJDORF029 

15183 gtgtttaaacggaacgtaaatacaaaacggtattattacattgcgacgagattaaaggacattttccacatcaaatctcaatgt 
1 VFKWNVHTKRYYYIAMRLKDIPHIKSQC 
15099 ttgaagatttatatgacgccaaagtcgtatattcatattatgaacataacctgttcactaaaaaacacgcgtatatcatag 15019 
29 LKIYMTLKLYIHIMNITCSLKNTRIS* 



44AHJDORF028 

9235 atggaatatatgcacgtccaatcgtacctgctctcatatttttcgcaaaacctgcattaccttttctttgtacgtcttgtggta 
1 MEYMHVQLYLLSYFLQKLHYLPFVRLVV 
9151 caaagtggacgatgttacctgcgtcacaccaagacggttgtccagcttgttttgattgtgataccaactttcttgctatga 9071 
29 QSGRCYLRHTKTVVQLVLIVILTPLL* 
44AHJDORF030 

14487 gtgaataaaacaccaaatgacacgcgcatatgcagtgtcataagtacgacaagtgtaatcaaaaatgctaaaaggaaaacaatg 
1 VNKTPKDTRICSVISMISVIKNAKRKTM 
14571 gctatgtttaataggttattcatggtcaatcacttccccactatcgtatacgactctgttttgacaaataatcattaa 14646 
29 AMPHRLFMVNHFPIIVYDFVLINNH* 
44AHJDORF031 

11039 atgacattgcacagttcattgtcatcatctaaacggaataagttaaaatgtgaacgtaatgcaggtatgccatataatccattt 
1 MILYSSLLSSKRNKLKCERNAGMPYNPP 
11123 aaaacgactttagataacataaccccctcattcgagtatgggtgttcgttgatatcatcagtaatgtga 11191 
29 KTTLDNITSSFEYGCSLISSVM* 
44AHJDORF135 

693 atgaaaacatgteaattgatacaaataaaagtgaagatagttatggtgtacaaactcattcactttcaaaacaatQfltttracag 
1 MKTCQLIQI KVKIVMVYKFIHFQN H-» H L Q 

777 gcgacgttgaggaggaataataaactatggcacaacaatctacaaaaaatgaaaccgcactttcag 842 
29 VTLRRNNKLWHNNLQKMKLHP* 
44AHJDORF033 

3795 atgccatcatttaaccaqctccaccaaacctgcaaaaaacatttctcaccaaattcatttaaaattctctttcttaaatcgtac 
1 MPLFNHLYQICKKHFLSNSFKIPFLKSY 
3711 gctttatcaatattatcaattaaatactgcttagtgaattgtgtaccttttgcactaccttcctga 3646 
29 ALSILSIKYCLVNCVPFALPF* 
44AHJDORF032 

9455 atggcttgtcttgccaaagcgagcagtgaactaccaccgtcaccactactaccaccgtcagacgaatcactaggtgatccaccc 
1 MACFAKASSELPLSPLLPLSDESLGDPP 
9371 ttaccgtctaattcaccaccccaagctagaatagcatecgcaccgtceaaaaatggateaccatag 9306 
29 LPSNLPPQARIVFAPSKNGLP* 
44AHJDORF034 

14146 atgatgatcctaataataaaaacgctaaaaagcataaatacgctttatataatttacaagctaaaaataataattcttcaatgt 
1 MMILIIKTLKSINTLYIIYKLKIIILQC 

14062 ataaatatattaaagaaatcgaeactttatataaagaaattggtaaatcagatagaccagtga 14000 
29 INILKKSILYIKKLVNQIDQ* 
44AHJDORF035 

13957 atgcaacatttgacgaataaatttaacactgtaaacgacatcataaactattacaaggagcaaaaacatggtaaaacaaaatcg 
1 MQHLTNKFNTVNDIINYYKEQKHGKTKS 

13873 tttagacatggtaagagattatcaaaatgctgtcaatcatgtcagaaaaaaaatcccagataa 13811 
29 FRHGKRLSKCCQSCQKKNPR* 
44AHJDORF036 

10165 gtgtatacaataccacacgtgacggtgcaacatatggtggtacattatagcttgcaactaaaaacgaaccatcttcaaaaactg 
1 VYTIPHVMVQHMVVHYSLQLKTNHLQKL 
10081 ctacaacaacacctgtgcgaccaataccatatgcagttgcttgtaagtatggtggtttactag 10019 
29 LQQHLCDQYHMQLLVSMVVY* 
44AHJDORF037 

14788 atgtcgacatctaacgtaaataactcttcttcaatttcaaaatcatcatactgtttgtcaaactcaatatacacatcacccata 
1 MSISHVNNSFSISKSSYCLSNSIYTSPI 
14872 tttatctttactatacattttttattagatgaagtaaatttttcaaattcatcattataa 14931 
29 FIFTIHFLLDEVNFSNLSL* 
44AHJDORF038 

3671 gtgtaccttttgcactacctttttgattttgattacgttttgcgttttgattactttcgccacccgatttattcacagttttac 
1 VYLLHYLFDFDYVLRFDYPRYSIYSQFY 
3587 cgttatcaatcgtattattatcagcgaatcgcaacgttgtattatcaacatcaatgttaa 3528 
29 RYQSYYYQRIVTLYYQHQC* 
44AHJDORF039 

1743 gtgccgtatttacttatgatgtatctaaacttaaagagtttactggcaacgttgaagaaattaaaccaaaatcagatttatatg 
1 VLYLLMMYLNLKSLLATLKKLHQNQIYM 
1827 cgtttattttggatattaattcaaetaaatataaacgttacacaaaaggtattgttaa 1883 
29 RLFWILIQLNINVTQKVC* 

44AHJDORF040 

9740 gtggtaactggacatatgcacagttaccagaaaaatacaaaaaagcaattggtgtacctttattcaaaaaagaatacttataca 
1 VVTGHKHSYQKNIKKQLVYLYSKKNTYT 
9824 aaccaggtaacatatttcctcaaacgggtaatgcaggacaacgcacagaattaa 9877 
29 NQVTYFLKRVMQDNVQN* 

44AHJDORF041 

15836 atgtcgtcaactttcaccattatatcactcctttctaaaaaacgtaaacgtcatacgtttcataaaatcctttatgcatattcc 
1 MSSTFIIISLLSKKRKRYTFHKILYAYS 

15920 attgttctatcgggtcatcaccagcaatataagacaatactgattctggtttag 15973 
29 IVLLGHHQQYKTILILV* 

44AHJDORF042 

5151 atgcacgaccgtcgccctttgtcaattcatagttttgcgaaccccctgcgcgtaacgcttcaaagtgttcatactcaccaagtc 
1 MHDRRLLL1YSFVHLLRVMLQSVHTHQV 
5067 ggaagaaaccatataaattatggaaacgttttccaccaccgccgtctgtcatag 5014 
29 GRNHIMYGNVFHHRRLS* 
44AHJDORF043 

4539 atgcgacctgtaacagttctgcaacaccatcgtgatgcaaccagactttcatttcaccattggatcgacgttctaacccgattg 
1 MRLVTVLQHHRDVTRFSFHHMIDVLIRL 
4455 ttgtaccatgaccaccctgtacaatacgcatgcttgaaactaagtcaccactag 4402 
29 LYHDHPVQYACLKLSHH* 

44AHJDORF044 

12917 atgtcacctatttacgtgatgacacgttttacaaagaaaacatggaacgctattactacaatccaagcaacttacattttgaca 
1 MLPIYVMICFIKKTWNVITTIQAIYILT 
12833 atgcttactctaaaaattacgtggttgataatgatagatatttatatttag 12783 
29 MLTLKITWLIMIDIYI* 

44AHJDORF149 

770 atgattgtcttgaaagtgaatgaatttgtacaccataaccatcttcactttcatttgtatcaattgacatgttttcatttaatc 
1 MIVLKVMEFVHHNYLHFYLYQLTCFHLI 
686 ctgttcgtttacttaatcttgaatcttcatatgatgtacccatcatag 639 
29 LFVYLILNLKMMYPS* 

44AHJDORF046 

4891 atgattatccatctaagttatcatatcaagacggtattaattccccacgtgataacttcaaagagcctgagggtatxcg<?a"ttt 
1 MIIHLSYHIKTVLISHVITLKSLR V— P A F 

4975 atacaaatccaaaaacaaaacgtaaatcgttattacttgctatga 5019 
29 IQIQKQNVNRYYLL* 
44AHJDORF047 

11911 atgaatgcatgtaagttgtccaggcgtgagttctgcaaaacatttcacagcatagtcataggctccactatcattcatatcact 
1 MNVCKLFRCEFCKTFHSIVIGFTI1H1I 
11995 atctttatcaaaaaccgtacaattaaaatctgttttaagttgtga 12039 
29 IFIKNRIIKICFKL* 
44AHJDORF045 

10655 atggcaccgtcaaagaatcgttcacgtacaaaggtttcaaaaccgacgcttgtatcaaaggcgtttttcggtataccagcagaa 
1 MAPSKNCSRTKVSKSTLVSKAFFGIPAE 
10739 gcaattttaaectttccattcacttcatatgcatatttcttatga 10783 
29 AILIFPFTSYAYFL* 
44AHJDORF048 

15340 atgaggacgttgttgacattaecaatgctggagaagctcaattcacaatttatgaatatgaaaacaaaaaaggtcaaaaaggtt 
1 MRTLLTLSMLEKFNSQFMNMKTKKVKKV 
15256 actcaatcaaetttggtcaagtatcattttaatacaattteaeag 15212 
29 TQSILVKYHFNTIS* 

44AHJDORF049 

5784 acgagggggcaggtgttgactttgatggtgcatatggacttcaatgtatggacttaccagttgcccatgtgtactacattactg 
1 MRGQVLTLMVHMDFNVNTYQLLMCITLL 
5868 acggtaaagttcgcatgtggggtaatgctaaagacgcgataa 5909 
29 TVKFACGVMLKTR* 
44AHJDORF050 

13158 gtgtgttacgtttttcattcacgtaatcgctccgtcgcatctctaaaaaaatgtttttgtaaagtcttgatgcattcattttat 
1 VCYVFHSRNRFVAFLKKCFCKVLMYSFY 
13242 gctttcgtaataaatcgtatatatttaaattggataatatag 13283 
29 APVINCIYLNWII* 
44AHJDORF051 

11066 atgacaacaatgaactatacaacatcaccaacggttacaaaaacactgaacgtaatacattattctctacatttgtcacaccac 
1 MITMNYTISLTVTKTLNVIYYSLHLSHH 
10982 gctcattgcacaacttattggttcctttccaatacttaa 10944 
29 VHCITYMFLSNT* 
44AHJDORF052 

14338 atgatttcagtaatgttaattttaaatttgatgataaagacttacaagaggcgcacattgacacatggaaacattttgcacatc 
1 KILVMLILNLMIKIYKRRTLTHGNILHI 
14254 tgccctattttcctaaagaaagaaacgtatcacatgtaa 14216 
29 CPIFLKKETYKM* 
44AHJDORF053 

3348 atgtggtttattcatcaagtgaagttgaaaaatacttacaaccacaaggctccacagaacacaacgaagatacaacaagcaaca 
1 MWFIHQVKLKNTYNHKASQNTMKIQQVT 
3432 ctgacgaaacatcgaatcaaaatgctacaectttag 3467 
29 LMKHRIKMLHL* 

44AHJDORF054 

7551 atgactggaacggaaatacgatgttactcgacgccggtaagatttcacaaaaaactggtgttaagttacgtacaaaatcaatta 
1 MTGMEIRCYSTLVRFHKKLVLSYVQNQL 
7635 ttggttatcataatgaagttcgagtatatccagtag 7670 
29 LVIIMKFEYIQ* 

44AHJDORF055 

15705 atgtgtctggtaacaattcctttgcttgtgttttggccaaatgatactcgtgaagtggtaaaaattcctcaatgtactcattat 
1 MCLVIILLLVFWLNDTREVVKIPQCIHY 
15789 catcatctaagtaatgaagtatataacctttga 15821 
29 HHLSNEVYNL* 
44AHJDORF056 

5512 gcgagtattacatcacaggtaaccaaatggaattattcagagacgcgccagaagaaattaaaaaagtgggtgcatggttacgtg 
1 VSITLQVTKWMYLETRQKKLKXWVHGYV 
5596 tgtcaagtggtaacgcagtcggtgaagtaa 5625 
29 CQVVTQSVK* 
44AHJDORF057 

10121 atgcaccaccatatgttgcaccatcacgtgcggtattgtacacactcattaatggcgtaccaaataatgctggtgataatattg 
1 MYHHMLHHHVWYCIHSLMAYQIMLVIIL 

10205 tattctttagtggtattgcttaattaa 10231 
29 YSLVVLLN* 
44AHJDORF058

10767 atgcatatttcttatgattcagtacaaacatcttatctatctgttcgttttcaatatcccatctacctaaggctatcgggtcga 
1 MHISYDSVQTSYLSVRFQYPIYLRLSGR 
10851 aeaaactggggttcaataagggtctaa 10877 
29 IMWGSIRV* 
44AHJDORF164 

702 acgttctcatttaaccctgctcgtttacctaatcctgaacccccatatgatgtacccatcatagaacgcatgttgtttccctca 
1 MFSFNSVRLFNLESSYDVPI IERNLPPS 

618 tacatgtttaaattcctcctaatctaa 592 
29 YMFKFLLI * 

44AHJDORF059

8360 acggattttgtaacattggattacccgaaccgccattatgccaaaatcttacaccagattctaaaattgcctttaatt§ttcca 

1 MDFVTLDYLNRHYAKILHQI LKLL-falVP 

8276 ttaacacggggccgatgccacgtatag 8250 

29 LTWGRCHV* 

44AHJDORF060 

6257 atgtaccattttcatttctataatatgtgccgtattggtttcgttcccattttccaaatgtatttacttctgacgtttctaatg 
1 MYHFHFYNMCRIGFVSIFQMYLLLMFLM 
6173 ctttgctaetaccacctgaaaatttag 6147 
29 LCYYYLKI* 
44AHJDORF061

15551 atgcgttttggtgtcctgataaaacaccctttacgtttgccatcctatttctcctctcatttaaactattcgcctcccgcaatt 
1 MC FGVLIKYLLRLSFYFSSYLNYLLSAI 
15635 gcgatttgtagtaaatcattgtaa 15658 
29 AICSKSL* 
44AHJDORF062

4285 gtggtattcgcaacgcagttaaccaatctattaatattgataaagaaacaaatcacatgtactctacacaatccgattctcaaa 
1 VVFATQLTNLLILI KKQITCTLHNPILK 

4369 aacctgaaggtttttggataa 4389 
29 N L K V F G * 

44AHJDORF063 

9487 atgcgccttgtattttttttaataattcccgcatggcttgttttgctaaagcgagtagtgaactaccaccgtcaccactactac 
1 MRLVFFLIILAWLVLLKRVVNYHCHHYY 
9403 cactgtcagacgaatcactag 9383 
29 H C Q T N H * 

44AHJDORF065 

5029 gtggtggaaaacgtttccataatttatatggtctcttccaacttggtgagtatgaacactttgaagcattacgcgcaagaggtt 
1 VVBNVSIIYMVSSNLVSMNTLKHYAQEV 
5113 cacaaaactataaattaa 5130 
29 H K T I N * 

44AHJDORF064 

2609 atgacgagtcaatcaatcaacttgtgtccgaaatatataacggtgcaccatttgttaaaatgrcacctatgtttaatgcagatg 
1 MTSQSINLCPKYITVHHLLKCHLCLMQM 
2693 acgatateategatetaa 2710 
29 T I S L I * 

44AHJDORF066

10481 atgatattccttatattgaaagtgacatcggttcattttcacttaacgacttatttccagttgaacgttcagcacataacaaat 
1 M I F F I LKVTSVHFHLTTYFQLNVQYITN 

10397 ctgatttgcatatattaa 10380 

29 L I C I Y * 
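
The pairing of nucleotide lines and amino acid lines in the listing above can be checked by translating a nucleotide line codon by codon. The following is a minimal sketch, assuming the standard genetic code (the listing does not state which translation table was used) and assuming the printed sequence is the coding strand; the names `CODON_TABLE` and `translate` are illustrative and do not come from the specification.

```python
# Standard genetic code packed in T, C, A, G order. This is an assumption:
# the listing does not say which translation table was applied.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding-strand sequence; 'X' marks codons that cannot be read."""
    dna = dna.lower()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    codons = (dna[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Nucleotide line printed with coordinates 15635-15658 in the listing.
tail = "gcgatttgtagtaaatcattgtaa"
print(translate(tail))
```

For this line the sketch prints `AICSKSL*`, which matches the amino acid line printed beneath it. Codons containing characters other than a, c, g, or t are reported as `X` rather than guessed.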
Table 19 



Sequence similarities between phage 44AHJD ORFs and public databases 

Phage: 44AHJD 
Database: nr 

Hit                                                                 Score (bits)   E value

Query= sid|110871|lan|44AHJDORF001 Phage 44AHJD ORF|10342-12627|-1
(761 letters)

gi|118848|sp|P19894|DPOL_BPM2 DNA POLYMERASE >gi|76896|pir||JQ0...       55       1e-06
gi|1072656|pir||S51275 DNA polymerase - phage CP-1 >gi|836593|e...       53       6e-06
gi|1429230|emb|CAA67649| (X99260) DNA polymerase [Bacteriophage...       49       1e-04
gi|1572479|emb|CAA65712| (X96987) DNA polymerase [Bacteriophage...       46       0.001
gi|118851|sp|P06950|DPOL_BPPZA DNA POLYMERASE (EARLY PROTEIN GP...       45       0.002
gi|2435429 (AF012250) unassigned reading frame (possible DNA po...       45       0.002
gi|1084487|pir||S41618 DNA polymerase - slime mold (Physarum po...       45       0.002
gi|4877819|gb|AAD31446.1| (AF133505) DNA polymerase [Neurospora...       44       0.004
gi|461962|sp|P33537|DPOM_NEUCR PROBABLE DNA POLYMERASE >gi|2833...       44       0.004
gi|2499511|sp|Q12471|6P22_YEAST 6-PHOSPHOFRUCTO-2-KINASE 2 (PHO...       41       0.041
gi|2258375|gb|AAD11909.1| (AF007261) transcription initiation f...       40       0.070
gi|15734|emb|CAA37450| (X53370) DNA polymerase (AA 1-575) [Bact...       39       0.092

Query= sid|110872|lan|44AHJDORF002 Phage 44AHJD ORF|3789-5732|3
(647 letters)

gi|133273|sp|P27622|TAGC_BACSU TEICHOIC ACID BIOSYNTHESIS PROTE...      112       7e-24
gi|142847 (M64050) DNase inhibitor [Bacillus subtilis]                   52       1e-05
gi|4038407 (AF103943) factor C protein precursor [Streptomyces...        39       0.10

Query= sid|110873|lan|44AHJDORF003 Phage 44AHJD ORF|6626-8389|2
(587 letters)

gi|138123|sp|P04331|VG9_BPPH2 TAIL PROTEIN (LATE PROTEIN GP9) >...       92       8e-18
gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9) >...       82       1e-14
gi|1429238|emb|CAA67657| (X99260) tail protein [Bacteriophage...         78       2e-13
gi|215339 (M12456) p9 tail protein [Bacteriophage phi-29] >gi|2...       71       2e-11
gi|1181968|emb|CAA87738.1| (Z47794) tail protein [Bacteriophage...       54       3e-06
gi|1181970|emb|CAA87740.1| (Z47794) tail protein [Bacteriophage...       42       0.010

Query= sid|110875|lan|44AHJDORF005 Phage 44AHJD ORF|12643-13890|-1
(415 letters)

gi|3845203 (AE001399) GAP domain protein (cyclic nt signal tra...        52       6e-06
gi|3758843|emb|CAB11128.1| (Z98551) predicted using hexExon; MA...       49       5e-05
gi|3845297 (AE001421) hypothetical protein [Plasmodium falciparum]       48       1e-04
gi|4493936|emb|CAB38972.1| (AL034556) predicted using hexExon; ...       47       2e-04
gi|3845165 (AE001390) hypothetical protein [Plasmodium falciparum]       46       6e-04

Query= sid|110877|lan|44AHJDORF007 Phage 44AHJD ORF|2044-3027|1
(327 letters)

gi|1181960|emb|CAA87731.1| (Z47794) connector protein [Bacterio...       46       5e-04
gi|1429239|emb|CAA67658| (X99260) upper collar protein [Bacteri...       45       8e-04
gi|137915|sp|P07535|VG10_BPPZA UPPER COLLAR PROTEIN (CONNECTOR ...       44       0.002
gi|137914|sp|P04332|VG10_BPPH2 UPPER COLLAR PROTEIN (CONNECTOR ...       41       0.009

Query= sid|110878|lan|44AHJDORF008 Phage 44AHJD ORF|3020-3775|2
(251 letters)

gi|4982468|gb|AAD309«3.2| (AF118151) SNF1/AMP-activated kinase ...       52       3e-06
gi|1730077|sp|P18160|KYK1_DICDI NON-RECEPTOR TYROSINE KINASE SP...       46       2e-04
gi|3758855|emb|CAB11140.1| (Z98551) predicted using hexExon; MA...       46       2e-04
gi|585795|sp|P21538|REB1_YEAST DNA-BINDING PROTEIN REB1 (QBP) >...       46       3e-04
gi|172372 (M58728) DNA-binding protein [Saccharomyces cerevisiae]        46       3e-04
gi|2952545 (AF051898) coronin binding protein [Dictyostelium di...       45       6e-04
gi|535260|emb|CAA82996| (Z30339) STARP antigen [Plasmodium reic...       45       7e-04
gi|1429240|emb|CAA67659| (X99260) lower collar protein [Bacteri...       44       0.001


WO 00/32825 



PCT/IB99/02040 



283 



Hit                                                                 Score (bits)   E value

Query= sid|110879|lan|44AHJDORF009 Phage 44AHJD ORF|5744-6496|2
(250 letters)

gi|2764981|emb|CAA69021.1| (Y07739) N-acetylmuramoyl-L-alanine ...      180       1e-44
gi|113675|sp|P24556|ALYS_STAAU AUTOLYSIN (N-ACETYLMURAMOYL-L-AL...      118       5e-26
gi|1763243 (U72397) amidase [bacteriophage 80 alpha]                    118       6e-26
gi|4574237|gb|AAD23962.1|AF106851_1 (AF106851) LytN [Staphyloco...       84       9e-16
gi|3767593|dbj|BAA33856.1| (AB015195) LytN [Staphylococcus aureus]       84       9e-16
gi|2764983|emb|CAA69022.1| (Y07740) cell wall hydrolase Ply187 ...       77       2e-13
gi|3287732|sp|O05156|ALE1_STACP GLYCYL-GLYCINE ENDOPEPTIDASE AL...       73       2e-12
gi|79926|pir||A25881 lysostaphin precursor - Staphylococcus sim...       69       3e-11
gi|126496|sp|P10548|LSTP_STAST LYSOSTAPHIN PRECURSOR (GLYCYL-GL...       69       3e-11
gi|3287967|sp|P10547|LSTP_STASI LYSOSTAPHIN PRECURSOR (GLYCYL-G...       69       3e-11
gi|3341932|dbj|BAA31898.1| (AB009866) amidase (peptidoglycan hy...       68       6e-11

Query= sid|110882|lan|44AHJDORF012 Phage 44AHJD ORF|8391-8813|3
(140 letters)

gi|140528|sp|P24811|YQXH_BACSU HYPOTHETICAL 15.7 KD PROTEIN IN ...       80       6e-15
gi|4126631|dbj|BAA36651.1| (AB016282) ORF45 [bacteriophage phi-...       76       1e-13
gi|141088|sp|P26835|YNGD_CLOPE HYPOTHETICAL 14.9 KD PROTEIN IN ...       61       4e-09
gi|2293160 (AF008220) YtkC [Bacillus subtilis] >gi|2635548|emb|...       36       0.099
gi|1181973|emb|CAA87743.1| (Z47794) holin protein [Bacteriophag...       31       3.3
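
The query headers of Table 19 give each ORF's nucleotide coordinates and its length in letters (amino acids), and the two can be cross-checked. The sketch below assumes that each coordinate pair spans the coding sequence including its stop codon; that reading is an assumption, not something the table states. The function name `orf_length_in_letters` is illustrative. The start coordinate of ORF002 is taken as 3789, the reading that agrees with the Table 20 header for the same ORF.

```python
def orf_length_in_letters(start: int, end: int) -> int:
    """Amino acid count implied by inclusive coordinates, stop codon excluded."""
    span = abs(end - start) + 1  # inclusive nucleotide span
    if span % 3 != 0:
        raise ValueError(f"span of {span} nt is not a whole number of codons")
    return span // 3 - 1  # drop the stop codon


# (start, end, letters reported in the Table 19 query header)
QUERIES = {
    "44AHJDORF001": (10342, 12627, 761),
    "44AHJDORF002": (3789, 5732, 647),
    "44AHJDORF003": (6626, 8389, 587),
    "44AHJDORF005": (12643, 13890, 415),
    "44AHJDORF007": (2044, 3027, 327),
    "44AHJDORF008": (3020, 3775, 251),
    "44AHJDORF009": (5744, 6496, 250),
    "44AHJDORF012": (8391, 8813, 140),
}

for name, (start, end, reported) in QUERIES.items():
    computed = orf_length_in_letters(start, end)
    status = "ok" if computed == reported else "MISMATCH"
    print(f"{name}: computed {computed}, reported {reported} ({status})")
```

Under that assumption all eight headers are internally consistent, which also gives a way to resolve coordinates that are hard to read in the printed table.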
Table 20 

Homologies between phage 44AHJD ORFs and proteins in public databases 



Query= pt|110871 44AHJDORF001 Phage 44AHJD ORF|10342-12627|-1|
(761 letters)

>gi|118848|sp|P19894|DPOL_BPM2 DNA POLYMERASE >gi|76896|pir||JQ0161
DNA-directed DNA polymerase (EC 2.7.7.7) - phage M2
>gi|215509 (M33144) DNA polymerase [Bacteriophage M2]
Length = 572

Score = 55.4 bits (131), Expect = 1e-06

Identities = 96/426 (22%), Positives = 159/426 (36%), Gaps = 88/426 (20%)

Query: 229 KLTPEQLTYIHNDVIII^MCHIHYSDIFPNFDYNKLTFSLNIMESYLNNEMTR FQ 283
           ++TPE+ YI ND+ 1+ DI +++T ♦ +♦ ♦ + T+ F
Sbjct: 154

Query: 284
           L+ D +1 + YRGG N KY K I B D+MS YP
Sbjct: 210 KLSLPMDKEI RKAYRGGFTWLNDKYKEKEIGEGMV- FDVNSLYP 252

Query: 344
           MY +P YP+ +D+LYI+P+L K+ +
Sbjct: 253 QMYSRPLP YGAPI VFQGKYEKDEQYPLY- IQRIRFEFEL KEGYIPTI 299

Query: 404 XXXXXXXXXXXXXXXXXUWIQ-DITGIOOTHIRVMSFVIYECEYFHARDIIFQNYFIK 462
           ♦ ++ +T +D X+ + + +Y BY F +
Sbjct: 300 IKKNPFFXGNBYLKNSGVEPVELYLTNVDLEHQEH-YELYHVEYIDGFK FRB 352

Query: 463 QGKLKNKINKTS PYDYHI TDDINEHPYSNEEVMLSKVVLNGLYG I PAL 511
           G K+ 1+ + H + L+K++LN LYG +P L
Sbjct: 353 TGLFKDFIDKWTYVKTH BEGAKKQLAKLMLNSLYGKFASNPDVTGKVPYL 403

Query: 512
           + +L FR+ D YK+ + F+T+ + + + Q D
Sbjct: 404 KDDGSLGFRVGDEB YKDPVYTPM-GVFITAWARFTTITAAQACY DRI 449

Query: 571 IYCmT3SLYMKSVVKPI^PSLFDPIAIiGKWDIENEQlDKMFVLNHKK YAYBVKG 625
           IYCDTDS+++ P++DPLGWB+ + L K Y EV+G
Sbjct: 450

Query: 626
           K+K S
Sbjct: 509



>gi|1072656|pir||S51275 DNA polymerase - phage CP-1
>gi|836593|emb|CAA87725.1| (Z47794) DNA polymerase
[Bacteriophage CP-1]
Length = 568

Score = 53.5 bits (126), Expect = 6e-06

Identities = 104/464 (22%), Positives = 169/464 (36%), Gaps = 66/464 (14%)

Query: 230 LTPEQLTYIHMDVI IL- '-GMCTIHYSDIFPNFDYNKLTFSLNIMBSYLNNEMTRFQLLNQ 287 

+ PB + YIH DV IL G+ ++Y ♦ P Y + +L + +F+ 
Sbjct: 152 IKPBWIDYIHVDVAILARGIFAMYYBEHFTK- - YTSASEALTEFKRIFRKSKRKFRDFPP 209 

Query: 288 YQDIKISYTHYHFHDMMFYDYIKSFYRGGLNMYNTKYINKLIOEPCFSIDIMSSYPYVW 347 

D K+ D+ + G ♦ K+ + +♦+ DINS YP M 

SbjCt: 210 ILDBXVD DFCRKHIVGAGRLPTLKHRGRTLNQLIDIYDIMSMYPATML 2S7 

Query: 348 HEKI PTWLYFYEHYSEPTLI PTFLDDDNYFSLY- KIDKDVFNDDL- LIKIKSRVLRQMXX 405 

+P + ♦ Y P + +D+Y+ + K D D+ L I+IK 

Sbjct: 258 QNALPIGXP- -KRYXGK PKEIKEDHYYIYHIKADFDLKRGYLPTIQIKKKLDALRIG 312 

Query: 406 XXXXXXXXXXXXXXXXLRMIQDITGIDCMHIRVNSFVIYECEYFHARDIIFQNYFIKTQG 465 

L+ +H +EF +F+Y 
Sbjct: 313 VRTSDYVTTSiaJEVIDLYLraFDLDLFLKHYDATIMYVETLE - FQTBSDLFDDYI 366 
Query: 


A C C 

H DO 


Sbjct: 


367 


Query: 


524 


Sbjct: 


413 


Query: 


S78 


Sbjct: 


463 


Query: 


638 


Sbjct: 


51S 



+ y y 

-TTYRYK- 



£+ S E +K++LN LYG ♦ S L LDD 
- KENAQS PAS KQKAXI MLN S LYG KFG AK I 1 S VKKLA YLDDK 412 



SYKNTERNIL FSTFVTSRSLYNLLVPFQYLTESEIDDNFIYCDTDS 577 

+KN + * + FVTS + + ++ Q E DNF+Y DTDS 

-FKNDDEEEVQPVYAPVALFVTSIARHPI ISNAQ ENYDNFLYADTDS 4 62 



L++ 



A T 



DP GKW E + K 



E F GA E ++ 



+G 



K Y E+ 



IY + +1 



>gi|1429230|emb|CAA67649| (X99260) DNA polymerase [Bacteriophage
B103]
Length = 572

Score = 49.2 bits (115), Expect = 1e-04

Identities = 93/422 (22%), Positives = 155/422 (36%), Gaps = 88/422 (20%)



Query: 229 KLTPEQLTYIHOTVIII^CKIHYSDIFPNFDYNKLTFSLNIMESYLNNEKTR FQ 283 

++TPE+ YI ND+ I* DI +++T + + + T+ F 

Sbjct: 154 EITPEEYBYIXNDIEI I AHA LDIQFKQGLDRMTAGSDSLKGFKDIL5TKKFNKVFP 209 

Query: 2 84 LLNQYQDIKISYTHYHFHDMNFYDYIKSFYRGGLNMYKTKYINKLIDEPCPSIDINSSYP 343 

L+ D +1 + YRGG N KY K I E D+NS YP 

Sbjct: 210 KLSLPMDKSI RRAYRGGFTWLNDKYKEKEIGEGMV- FDVNSLYP 252 

Query: 344 YVMYHEKIPTWLYFYEHYSEPTL1OTFLDDDNYFSLYX1DI03VFND 403 

HY+P YP+ + D ♦ LY I + P +L K + + 
Sbjct: 2S3 SQHYSRPLP YGAPIVFQGKYEKDEQYPLY- IQRIRPEFBL KEGYIPTI 299 



Query: 404 XXXXXXXXXXX} 



XXXLRMI Q- DITGIDCMHIRVNSFVI YECEYFHARDI 1 FQNYPTK 462 
++ +T +D 1+ ♦ + +Y EY F + 

Sbjct: 300 QIKKNPPFKGNEYIJCNSGAEPVELYLTNVDLELIQEH-YEMYNVEYIDGFK FSE 352 



Query: 463 TQGKLKNKINMTSPYDYHITDDINEHPYSNEEVMLSKWLNGLYG I PAL 511 

G K 1+ ♦ H + L+K++ + LYG +P L 

Sbjct: 353 KTGLFKEFXDKWTYVKTH EKGAKKQLAKLHFDSLYGKFASNPDVTG1CVPYL 403 

Query: 512 RSHFNL- FRLDDNNELYNI INGYKNTERNI LFSTFVTSRSLYNLLVPFQYLTESEIDDNF 570 

+ +L FR+ D YK+ *■ F+T+ ♦ + + Q D 
Sbjct: 404 KEDGSLGFRVGDEE YKDPVYTPM-GVFITAWARFTTITAAQACY DR1 449 

Query: 571 lYCDTDSLYMKSVVKPIJJIPSLFDPlAI^KWDlENEQIDKMFVLNHKK YAYEVNG 625 

IYCDTDS+++ P + + OP LG W E+ + L K YA BV+G 

Sbjct: 450 IYCDTDSIHLTGTEVPEIIKDIVBPKKI/JYWAKES-TFKRAKYLRQKTYIQDIYAKEVIXj 508 

Query: 626 KI 627 
K+ 

Sbjct: 509 KL 510 



>gi|1572479|emb|CAA65712| (X96987) DNA polymerase [Bacteriophage
GA-1]
Length = 578

Score = 46.1 bits (107), Expect = 0.001

Identities = 80/376 (21%), Positives = 146/376 (38%), Gaps = 54/376 (14%)

Query: 234 QLT^IHNDVIII/SMCHIHYSDIFPNFDYNKLTFSIJJIMESYLNNEMTRFQLIiNQYQDIKI 293 

+♦ Y+ +D++I+ + +F N D+ +T ♦ + +Y EM + +Y + 

Sbjct: 162 EIEYLKHDLLI VALA- - -LRSMFDN-DFTSMTVGSDALNTY- - KEMLGVKQWEKYFPVL- 214 

Query: 294 SYTHYHPT1DKNFYDYIKSFYRGGLNMYNTKYINKLIDEPCFSIDIHSSYPYVMYHEKIPT 353 

+ 1+ Y+GG M KY ♦ + D+NS YP +M +♦ +P 

Sbjct: 215 SLKVNSEIRKAYKGGFTWVNPKYQGETVYGGMV- FDVNSMYPAMMKWKLLP - 264 

Query: 354 WLYFYEHYSE PTLI PTFLDDDNYFSLYKI DKD VFNDDLLI KI KS RVLRQMXXXXXXXXXX 413 
Y EP ♦ ♦ ♦ LY F + KI ++ 




Sbjct: 265 VGEPVMFKGEYKKNVEYPLYIQQVRCFFELKKDKIPCIQIKGNARFGQNEYLS 317 

Query: 414 XXXXXXXXLRMIQDITG1DCMHIRVNSFVIYECEYFHARDIIFQNYFIKTQGKLIQIKINM 473 

L *T +0 1+ + ♦ I+E E+ +F+ + I 
Sbjct: 318 TSGDEYVDLY VTNVDWELIKKH- YDIFEEEFIGG- -FMFKGF IGF 3S9 

Query: 474 TSPYDYH1TDDIMEHPYSNEEVMLSKWLNGLYGIPALRSHFN- - LFRLDDNNELYNI IN 531 

Y + N S E+ + +K++LN LYG A * LD+N L 

Sbjct: 360 FDEYIDRFMEIKNSPDSSAEQSLQAKLMLNSLYGKFATNPDITGKVPYLDENGVLKFRKG 419 

Query: S32 GYKNTERNILFST- - -FVTSRSLYNLLVFFQYLTESBIDDNFIYCDTDSLYMKSVVKPLL 588 

K ER+ F+T+ + N+L Q L FIY DTDS++++ + + 

Sbjct: 420 ELK- -ERDPVYTPMGCFITAYARENI LSNAQKLYP RF I YADTDS I HVEGLGEVDA 472 

Query: 589 NPSLFDPIALGKWDIE 604 

+ DP LG WD E 
Sbjct: 473 I KDVIDPKKLGYWDHE 468 

>gi|118851|sp|P06950|DPOL_BPPZA DNA POLYMERASE (EARLY PROTEIN GP2)
>gi|75812|pir||ERBP2Z DNA-directed DNA polymerase (EC
2.7.7.7) - phage PZA >gi|216051 (M11813) gene 2 product
[Bacteriophage PZA] >gi|224741|prf||1112171E ORF 2
[Bacteriophage PZA]
Length = 572

Score = 45.3 bits (105), Expect = 0.002

Identities = 98/461 (21%), Positives = 166/461 (35%), Gaps = 110/461 (23%)

Query: 198 QLKTDFNYTIFI)KDNDh1NDSEAYDYAVXCFAKLTPEQLTYIHNDVIILGMCHIHYSDIFP 257 

++ DP T+ D D + Y ++TP++ YI ND+ 1+ + I 
Sbjct: 129 KIAKDFKLTVLXGDIDYHKERPVGY EITPDBYAYIKNDIQIIAEALL IQF 178 

Query: 258 NFDYNKLTFSLNIMESYLNNBMTR FQLLNQYQD I KI SYTHYHFHDMNFYDYIKSF 312 

+ ++ + ♦ T+ F L+ D Y 
Sbjct: 179 KQ^IjDRKTAGSDDIiKGFICDIITTKKFKXVFPTLSLGLDKEVRYA 222 

Query: 313 YRGGLNMYNTKYINKLIDEPCFSIDINSSYPYVMYHEKI PTWLYFYEHYSEPTLIPT- - P 370 

YRGG N +♦ K I E D+NS YP MY +P Y BP + 

Sbjct: 223 YRGGFTWLNDRFKEKEIGEGMV- FDVNSLYPAQMYSRLLP YGEPIVFEGKYV 273 

Query: 371 LDDDNYFSLYKID KDVFNDDLLIKI KSRVLRQMXXXXXXXXXXXXJCXXJUOtLRMI 42S 

D+D +1 K+ + ♦ IK +SR + 
Sbjct: 274 MDEDYPLHIQHIRCEFELKEGYIPTIQIK-RSRFYKGNEYLKSSGGEIADLW 324 

Query: 426 QDITGIDCMHIRVNSFVIYECEY^HARDIIFQNYFIKTQGKLKNKINMTSPYDYHITDDI 48S 

♦ + +D + + + +Y EY F T G K+ 1+ + 1 

Sbjct: 325 - - VSNVD-LELMKEHYDLYNVEYISGLK- FKATTGLFKDFIDKWTHIKTTSEGAI 375 

Query: 486 NEHPYSNEEVMLSKWLNGLYG IPALRSHFNL-FRLDDNNELYNIINGY 533 

♦ L+K++LN LYG VP L+ + L FRL G 
Sbjct: 376 KQ LAXLMLNSLYGKFASN PDVTGKVP YLKENGALGFRL -— GB 415 

Query: 534 KNTERNIL--FSTTv^SRSLYNIiVPFQYLTESBIDDNFIYCDTDSLYMKSVVKPLLNPS S91 

+ T> ♦ F+T+ + Y ♦ Q D IYCDTDS+++ P + 

Sbjct: 416 EETXDPVYTPKGVFITAWARYTTI TAAQACF DR 1 1 YCDTDS I HLTGTE1 PDVIKD 470 

Query: 592 LFDPIALGKWDIBNBQIDKMFVLNHKKYAY EVNGKI 627 

♦ DPLGNB+ + LKY EV+GK+ 
Sbjct: 471 IVDPIGa/SWAHES-TFKRAKYXiRQKTYIQDIYMKEVDGKL 510 



>gi|2435429 (AF012250) unassigned reading frame (possible DNA
polymerase) [Physarum polycephalum]
Length = 544

Score = 44.9 bits (104), Expect = 0.002

Identities = 118/545 (21%), Positives = 206/545 (37%), Gaps = 104/545 (19%)



Query: 179 TS I ATI^KKLLDGGYLTESQLKTDFNYTI FDKDNDMNDSEAYDYAVKCFAKLTFEQLTYI 238 

T+LKLD+TQ F N M Y +CFL P++ I 

Sbjct: 62 TQLFNLLKSLQDSSFYTFKQ FTYQNIM YSLEISCF- -LYPKKKILI 105 

Query: 239 HNDVIILGMCHIHYSDIFPNFD YNKL- -TFSLNIMESY-LNNEMTRFQLLNQYQD 290 

D+ +1 Y+D+ ++ YN++ ♦♦♦NI Y L+ +♦ + 

Sbjct: 106 -KDLYNFFSENIIYNDVVKDYKLLAILYNEIQTAYNININRKYILSTASLSLRIFKKSFP 164 
Query: 291 IKISYTHYHFHDMNFYDYIKSPYRGGLKMYKTKYINKLIDBPCPSIDINSSYPVVKYHEK 350 

K + D ♦ +YI+ Y GG N i * * * + d+NS YPY+M EK 

Sbjct: 165 EKYRLI PHLTRDED - - NYIRKSYIGGRNE IFEHVAQRNYFYDVNSLYPYIKKKEK 217 

Query: 351 IPTWLYFYEHYSEPTLIPTPLDD -DNYFS LYKIDKDVFNODUL IKIKSRVLRQ 402 

+P + Y + + F * +N+F L I+K N +L + IK+ V 

Sbjct: 218 MPIGI - - -PEYRDKEYhnCKFEKNl^FFGFlDVLITlEKTNNNIPVXPYRMGIKNNV-EV 273 

Query: 403 MXXXXXXXXXXXXXXXXXXLRMIQOITGIDCMHIPVVSFVrYECEr^FTL^IIFQNYFIK 462 

L + Q 1 + IY + ++++F+ Y ♦ 

Sbjct: 274 GIIYAKGTLRGIYFSEEIKLALKQGYK1IE IYSAYEYKBKEWFEEYVEQ 323 

Query: 463 TQGK-LKMKINWTSPYDYHITDOINEHPYSKEEVMLSKWUJGLYG IPALRS 513 

♦LKK D + D L K +LN LYG 1+ 

Sbjct: 324 KYNRRLKAK DPALKD LYKKLLNTLYGRFGLVYEQIDI ISP 363 

Query: S14 HF11LFRU3DH^LYNIIMGYKKTER1IILFSTFVTSRSLYNLLVPFQYLTESEIDDNFIYC 573 

L + DN ♦ ♦ ++ N++ FYT ++IY 

Sbjct: 364 EXEL-'ITDKTYISHiriTEFIDITANTCYNNIAI^^ 421 

Query: 574 DTDSLYMKSWKFLLHFSLFT>PIALGKVDIENEQIDKM^^ 632 

DTD L++K+ P+ + +L +GK+ +E+ + ?4 N K Y Y +N I 

Sbjct: 422 DTDGLFLKN PIPDIALTTSKEMGKFRLESINAEAHF1AN-KFYIYAPINSPI IYKFK 477 

Query: 633 GIPK NAFDTSVDFETFVR EQFFDGAIIENNKSIYNEQGT ISXYPSK 678 

GIF N D ++ +F+INNY+Q+ I Y + 

Sbjct: 478 GIPUJKPIFNIHDIITQHKKILNITI^HHYFTFSIRU^QTYSFQASRKRKLIP^KTT 537 

Query: 679 TEIVC 683 
I+C 

Sbjct; 538 PWIIC 542 



>gi|1084487|pir||S41618 DNA polymerase - slime mold (Physarum
polycephalum) >gi|509721|dbj|BAA06121.1| (D29637) DNA
polymerase [Physarum polycephalum]
Length = 547

Score = 44.9 bits (104), Expect = 0.002

Identities = 118/545 (21%), Positives = 206/545 (37%), Gaps = 104/545 (19%)

Query: 179 TSIATI^IOCLI^GQYLTESQLiCrDFNYTXroKDiroKNDSEAYDYAVKCFAKLTPEQLTYI 238 

T+LKLD + TQ F MM Y + CP L P++ I 

Sbjct: 65 TQLFNLLK5LQDSSFYTPKQ -FTYQNIM YSLEISCF- -LYPKKXILI 108 

Query: 239 HNDVIILGHCHIHYSDIFPHFD YNKL- >TFSLNIMESY-UINBMTRFQLLNQYQD 290 

D+ +1 Y4-D+ ♦+ YN++ +++NI Y L+ ++ ♦ 

Sbjct; 109 -KDLYNFFSE>7IIYNDVVia)YKI^ILYNEIQTAYNINIJ^KYIWTASLSLRIFKKSPP 16? 

Query: 291 1KISYTHYHFKDMMPYDYIKSFYRGGWIKYNTICYIHKLIDEPCFSID1IISSYPYVMYHEK 350 

K ♦ D + +Y1+ YGGN I + ♦ + +• D+NS YPY+M EK 

Sbjct: 168 EKYRLIPHLTRDED- -NYIRKSYIGGRNE IFBHVAQRNYFYDVNSLYPYIMKKBK 220 

Query: 351 IPTWLYFYEHYSBPTLIPTPLDD-DNYFS LYKIDKDVFNDDLL- - - IKIKSRVLRQ 402 

+P + Y ♦ + F + +N+P L r+K N +L + IK* V 

Sbjct: 221 MPIGI-»-PBYRDKEYMKlCFBKNIENFFGFIDVLITIEKTNlW 276 

Query: 403 MXXXX30O00ODCXXXXXXXLRMIQDITGIDCMHIRWSFVIYECEYFHARDIIFQNYFIK 462 

L ♦ Q 1+ IY + ++++F+ Y + 

Sbjct: 277 GIIYAKGTLRG1YFSEEIKLALKQGYKI2E IYSAYEYKEKEWFEEYVEQ 326 

Query: 463 TQGK- 14QJKINMTSPYDYHITDDINEHPYSNEEVT1LSKVVLNGLYG IPALRS 513 

+LKK D+D LK+LMLYG I + 

Sbjct; 327 MYNRRLKAK DPALKD LYKKLLNTLYGRFGLVYEQIDI ISP 366 

Query: S14 HFKLFRLDDNNELYNIINGYKNTERNILFSTFVTSRSLYNLLVPFQYLTESEIDDIJFIYC S73 

L ♦ DN ♦ + + ♦ N t- + ++++ FYT ♦ ♦ IY 
Sbjct: 367 EKEL- - ITD^^IYISHI^TEFIDITANTCYNNIAlTSAITSYARIF^^Y^^^IIJ^rYNLHVIYI 424 ' 

Query: 574 DTDSLYMKSVVKPLLJJPSLFD P I A1J3KWDI ENEQIDK^FVTJTHKKY AY - EVNGKX KI ASA 632 

DTD L++K+ P+ + +L +GK+ +E+ ♦ F+ N K Y Y +N I 

Sbjct: 425 DTDGLFLKN- - -PIPDIALTTSiCEMGKFRLESINAEAHFIAN-KFYIYAPINSPIIYKFK 480 



Query: 



633 GIPK KAFDTSVDPETFVR EQFFDGA1 1 ENNKS X YNEQGT ISIYPSK 678 
GIP ND + ♦ +F+INNY+Q+ I Y ♦ 

Sbjct: 431 GIPI^KPIFNIKDIITQHKKILNITl^HHYFTFSIRLNNNQTYSFQASRKIUCLIPNYKTT 540 

Query: 679 TEIVC 683 
I+C 

Sbjct: S41 PWIIC 54S 



>gi|4877819|gb|AAD31446.1| (AF133505) DNA polymerase [Neurospora
crassa]
Length = 1035

Score = 44.1 bits (102), Expect = 0.004

Identities = 36/172 (20%), Positives = 82/172 (46%), Gaps = 14/172 (8%)

Query: 521 DDlWELYNIINGWKNTERNIIJSTFvTSRSLYNLLVPPQYLTESEIDDNFIYCDTDSLYM 580 

+ N EL + ++G K+ I + + ♦ + ++ ++ +-*■♦+ S Y DTDS+++ 

Sbjct s 917 EKNYE LLS YLDGEKDDGF I INSTS IAAATASWSRI LMYKHI I NS A YTDTDSIFV 870 

Query: 581 KSVVXPLLNPSLFT3PIALGKWDIENEQIDKMPVLNHKXYAYEVNC3KIKIASAGIPKKAET) 640 

+ KPL +++ K+ +1+ ♦+ K Y + GK++I GI KN + 

Sbjct: B71 E---KPLDSAFIGEGCGKFKABYNGQLIKRAIPISGKLYLLDFGGKLEIKCKGITKNKDN 927 

Query: 641 TSVDFETFVREQPFDG All ENNKS I YNEQGTIS I YPSKTEI VCGNVYDE 689 

T+ * + E ++G + + E GT+++ K ++ G YD+ 

Sbjct: 928 TTHNIJ5INDFEALYira8SRVLFQERWGRSLELGTVTVKYQKYNLISG--YDK 977 



>gi | 4619S2 | 8p | P33537 | DPOM NEUCR PROBABLE DNA POLYMERASE 

>gi|283351|pir||S26985 probable DNA-directed DNA
polymerase (EC 2.7.7.7) - Neurospora crassa
mitochondrion plasmid maranhar (SGC3)
>gi|578156|emb|CAA39046| (X55361) putative DNA
polymerase [Neurospora crassa]
Length = 1021

Score = 44.1 bits (102), Expect = 0.004

Identities = 36/172 (20%), Positives = 82/172 (46%), Gaps = 14/172 (8%)

Query: 521 DDNNELYNI INGYKMTERNILFSTFVTSRSLYNLLVPFQYLTESEIDDNFrYCDTDS LYM 580 

+ N EL ♦ ++G K+ I ++ + ♦ ♦+ ++++ S Y DTD3+++ 

Sbjct: 815 EKNYELLS YLDGEKDDGFI INSTS IAAATASWSRI LMYKH I INSA YTDTDSIFV 868 

Query: 581 KSVVKPLLNPSLFDPIALGKWDIENEQIDKMFVI^NHKKYAYEVNGKIKIASAGIP 640 

+ KPL + ++ K+ +1+ ++KY + GK++I GI KN + 

Sbjct: 869 E- - - KPLDSAFIGEGCGKFKAEYNGQLIKRAIFISGKLYIJjDFGGKLEIKCKGITIOnaDN 925 

Query: 641 TSVDFETFVREQPFDG AI IENNKSI YNEQGTIS I YPS5CTEI VCGNVYDE 689 

T+ ♦ + E ++G + + E GT+++ K ++ G YD+ 

Sbjct: 926 TTHNLDINDFEALYNGESRVIJQERWGRSLELGTVTVKYQKYNLISG--YDK 975 



>gi|2499511|sp|Q12471|6P22_YEAST 6-PHOSPHOFRUCTO-2-KINASE 2
(PHOSPHOFRUCTOKINASE 2 II) (6PF-2-K 2)
>gi|2131162|pir||S61066 6-phosphofructo-2-kinase (EC
2.7.1.105) - yeast (Saccharomyces cerevisiae)
>gi|2131163|pir||S71026 6-phosphofructo-2-kinase (EC
2.7.1.105) - yeast (Saccharomyces cerevisiae)
>gi|1085116|emb|CAA62371| (X90861)
6-phosphofructo-2-kinase [Saccharomyces cerevisiae]
>gi|1420028|emb|CAA99157| (Z74878) ORF YOL136C
[Saccharomyces cerevisiae] >gi|1628439|emb|CAA64733|
(X95465) 6-phosphofructo-2-kinase [Saccharomyces
cerevisiae]
Length = 397

Score = 40.6 bits (93), Expect = 0.041

Identities = 48/208 (23%), Positives = 92/208 (44%), Gaps = 29/208 (13%)

Query: 175 MKTNTSIATLGKKI^DGGYLTESQLKTDFNYTIFDIODNDMNDSEAYDYAVKCFAKLTPEQ 234 

++ S AT+ K LL L+ + + FN K+ND ++ +A++T ++ 

Sbjct: 139 IRRQISCATISKPLL LSNTSSEDLFN PKNNDKKET YARITLQK 181 

Query: 235 LTY- IHNDVIIIXWCHIHYSDIFPNFDYNKLTFSLNIMESYLNNEMTRFQLLN-'- -QYQD 290 

L + I+ND +G+ SI + P + S+ +E++ F L+ Q 

Sbjct: 182 LFH E INNDE CD VG I FDATNST I ERRRFI PEEVCS FNTDELSSFNLVPI I LQVSC 235 






Query: 291 IKISYTHYHFHDMNFY-DYIKSFYRGGLNMYNTICYINKLIDEPCFSID-INSSYPYVMYH 34 8 

S+ Y+ H+ + F DY+ Y ♦ + + + FS+D N + Y+ H 

Sbjct: 236 FNRSFIKYNIHNKSFNEDYU3KPYELAIKDFAKRLKHYYSQFTPFSLDEFMQIHRYISQH 295 

Query: 349 EKIPTWLYFYEHYSEPTLIPTFLDDDNY 376 

E + I T L+Ft- ♦ + P L+ +Y 
Sbjct: 296 EEIDTSLFFFNVTNAGWEPHSLNQSHY 323 



>gi|2258375|gb|AAD11909.1| (AF007261) transcription initiation
factor sigma [Reclinomonas americana]
Length = 532

Score = 39.9 bits (91), Expect = 0.070

Identities = 49/205 (23%), Positives = 84/205 (40%), Gaps = 14/205 (6%)

Query: "0 miFLUCDTMRYFT)NITRENIYLKSAEENEHTLKMKEATILAKNQNVIL- - -EKRVKSSIN 156 

N+ * ♦ F + ++IY+ + +KE L K NVI + K +K N 

Sbjct: 177 ^LVKNSYIWLFKTVPHDSIYMNYSYIQTPI^ILKEYl^LIKIIhTVI ILQINKNIKKXNN 236 

Query: 157 LDLTMFLNGFKFNI IDNFM- - - KTNTS IATI/3KKLLDGGYLTESQLKTDFNYTIFDKDND 213 

L++++FL F ♦ N*+ K + + + K L Y+T I* T Y K 
Sbjct: 237 U^ISLFLYKFYQEIJCWNYIFINKISRHTQKINIKTLKNSYITFYNLITFIQYYTTKKQRL 296 

Query: 214 MNDSEAYDYAVKCFAK- - LTPEQLTYIHNDVI I LGMCHIHYSDIFPNFDYN- KLTFS LNI 270 

D +K F K P+ +N +1 G* HI+ + N K+T I 

Sbjct: 297 KiQI FYKQI FIKTFLKQHKI PKINKI KNNS LI KYG LTHI YDMI LI S I LRENI KVTLKNRI 356 

Query: 271 MBSYLNNEKTRFQLLNQYQDIKISY 295 

♦ +Y+ T + QY +KI Y 
Sbjct: 357 IFNYMPYITT ISKQY- - VKIGY 376 



>gi|15734|emb|CAA37450| (X53370) DNA polymerase (AA 1-575)
[Bacteriophage phi-29]
Length = 575

Score = 39.5 bits (90), Expect = 0.092

Identities = 41/150 (27%), Positives = 64/150 (42%), Gaps = 36/150 (24%)

Query: 497 LSKWLNGLYG- IPALRSHFNL-FRLDDNNELYNIINGYKNTERNIL- - F 542 

L+K++LN LYG +P L+ + L FRL G + T+ + 

SbjCt: 381 LAKLMI^SLYGKFASNPDVTGKVPYLKENGALGFRL GEEETKD PVYTFM 429 

Query: 543 STFWSRSLYNLLVPFQYLTESEIDDNFIYCDTDSLYKKSWKPLLNPSLFDPIAI^KWD 602 

F+T+ + Y + Q D IYCDTDS+++ P + + DP LG W 

Sbjct: 430 GVF I TAW ARYTT I TAAQACY DRIIYCDTDSIHLTGTEIPDVTIO)IVDPKKLGYWA 484 

Query: 603 IENEQIDKMFVLNHXKYAY EVNGKI 627 

E+ ++ L K Y EV+GK+ 
SbjCt: 485 HES -TFKRVKYLRQKTYIQDIYMKEVDGKL 513 
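
Each alignment in this table is summarized by an "Identities / Positives / Gaps" line giving counts over the aligned length together with a whole-number percentage. The following minimal sketch shows one way such a line could be parsed and the printed percentages cross-checked. It assumes a cleanly transcribed line with `=` separators; the regular expression and the name `parse_stats` are illustrative and are not part of any BLAST tool.

```python
import re

# Matches e.g. "Identities = 104/464 (22%)"; assumes '=' separators.
STAT = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def parse_stats(line: str) -> dict:
    """Return {name: (count, aligned_length, printed_percent)} for one summary line."""
    return {
        name: (int(count), int(length), int(percent))
        for name, count, length, percent in STAT.findall(line)
    }


# Summary line of the phage CP-1 DNA polymerase hit for ORF001.
line = "Identities = 104/464 (22%), Positives = 169/464 (36%), Gaps = 66/464 (14%)"
for name, (count, length, printed) in parse_stats(line).items():
    computed = 100 * count / length
    print(f"{name}: {count}/{length} = {computed:.1f}% (printed {printed}%)")
```

A computed value that differs from the printed percentage by more than about one point would suggest that one of the counts was misread during transcription.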



Query= pt|110872 44AHJDORF002 Phage 44AHJD ORF|3789-5732|3|
(647 letters)

>gi|13S273|sp|P27622|TAGC.BACSU TEICHOIC ACID BIOSYNTHESIS PROTEIN C 
>gi|478126|pir||D49757 techoic acid biosynthesis protein
tagC - Bacillus subtilis (strain 168) >gi|143727
(M57497) putative [Bacillus subtilis]
>gi|2636103|emb|CAB15594.1| (Z99122) alternate gene
name: dinC [Bacillus subtilis]
Length = 442

Score = 112 bits (278), Expect = 7e-24

Identities = 91/314 (28%), Positives = 147/314 (45%), Gaps = 58/314 (18%)

Query: 152 FELNBLEPKFVMGFGGIRNAVNQSINIDKETNHKYSTQSDS QKPEGFWINKLTPSG 207 

P+ + PK V QS N D++ + +Y+TQ S + + I +L+ G 

SbjCt: 7 FD FTN ITPKLFTELRVADICrvl,QS FT4FDEKNHQI YTTQVT^GLGKDMTQSYRITRl^ IjEXS 66 

Query: 208 DLISSMRIVQGGHGTTIGLERQSNGEMKIWLHHD GVAKLLQVAYKDNYVLDLEEA 262 

+ SM ♦ GGKGT IG+E + NG + IW +D ++L+ YK LD E + 

SbjCt: 67 LQLDSKLLKHGGHGTNIGIENR-NGTIYIWSLYDKPNBTDKSELVCFPYKAGATLD-EMS 124 
Query: 263 KGLTDYTPQSUJIKHTFTPLIDEANDKLILRFGDGTIQVRSRADVKNHIDNVEKEMTIDN 322 

K L ++ H TP +D N +L +R * D KN+ N ++ +TI N 

Sbjct: 125 KELQRFSNMPF- -DHRVTPALDMKNRQLAIR QYDTKNN - ~ NNKQWVTI FN 170 

Query: 323 SE MNDN RWMQGIAVTCDDLYWLSGNSSVNSHVQIGKYSLTTGQKI 3S7 

+ N +N ++QG +D LYW S+ + ♦ 
Sbjct: 171 LDDAIANKNNPLVTINIPDEUfYLQGFFLDDGYLYWYTGDTNSKSVPNL ITV 222 

Query: 3S8 YDYPFKLSYQDGINFPRD HFKEPEGICIYTNPKTKRKSLLLAMTNGGGGKRFH 420 

+D K+ Q I +D NF+EPEGIC+YTNP+T KSL+ + +T+G G R 

Sbjct: 223 FDSDNKIVLQKEITVGKDI^TRYENNFREPEGICMYTNPETGAKSLMVGITSGKEGKRIS 282 

Query: 421 NLYGFFQLGEYEHP 434 

+Y + YB+F 
Sbjct: 283 RI YAYH SYENF 293 

>gi|142847 (M64050) DNase inhibitor [Bacillus subtilis]
Length = 125

Score = 51.9 bits (122), Expect = 1e-05

Identities = 35/116 (30%), Positives = 55/116 (47%), Gaps = 10/116 (8%)

Query: 152 FELNELEPKFVMGFGGIRNAVNQSINIDKETNHMYSTQSDS QKPEGFWINKLTPSG 207 

F+ + FK V QS N D++ + +Y+TQ S + + I +L+ G 

Sbjct: 7 FDFTNITPKLFTELRVADKTVIiQSFNFDEKNHQIYTTQvASGM^ 66 

Query: 208 DLISSMRIVQGGHGTTIGI^RQSNGEMKIWLHHD GVAKLLQVAYKDNYVLD 258 

♦ SM + GGHGT IG+E + NG ♦ IW +D ++L+ YK LD 

Sbjct: 67 LQLDSMIjLKHGGHGTNIGMENR-NGTIYIWSLYDKPNETDKSELVCFPYKAGATLD 121 

>gi|4038407 (AF103943) factor C protein precursor [Streptomyces
griseus]
Length = 324

Score = 39.1 bits (89), Expect = 0.10

Identities = 61/269 (22%), Positives = 102/269 (37%), Gaps = 33/269 (12%)

Query: 172 VNQSINIDKETNHMYSTQSDSQKPEG- - - FWINKLTPSGDLISSMRIVQGGHGTTIGLER 228 

V QS D ♦+ Q S P+ I +L SG+ + M ++ GHG +10 + 
Sbjct: 66 VQQSFTFDIVNI«LFVAQLKSGSPDDSGDLCITQLDFSG>nCLGHMYLLGFGHGVSIGAQ- 124 

Query: 229 QSMSEMKIWLHHIXWAiaJJQVAYKDNYvT^^ 281 

+ +MD + ♦ + + G T SLKHP 

Sbjct: 125 PVGADTYLWTEVD WSMARGTRIJUIFKWNNGATLSRTSSALAXHQPVPGATEMTC 179 

Query: 282 LIDEANDKLILRFGIX3TIQVRSRADVKNHID>A^KEMTIDNSENNDNRWMQGIAVIX3DDL 341 

ID +R+ ++ +V+ V+D QGA+G + 

Sb j Ct : 180 AIDPVNNRMAIRYLTASGRRYGIYNVADIAAGVYDKPLSDVPHPTGLGTFQGYALYGSYV , 23 9 

Query: 342 YWLSGH SSVHSHVQIGKYSLTTGQKIYOYPFKLSYQDGINFPRDNFKEPBGIC 394 

Y L+GN + NS+V + TG + ♦ ♦ G F+EPEG+ 
Sbjct: 240 YQLTGNPYGPDNPNPGNSYVS- -SVDVNTGALVQ RAFTRAGSTL 7FREPBGHQ 290 

Query: 395 IYTNPKTKRKSLLIAKTKGGGGKRFHMLY 423 

IY + + L L +G G R NL+ 

Sbjct: 291 I YRTAAGEVR - LFLGFASGVAGDRRSNLF 318 

Query= pt|110873 44AHJDORF003 Phage 44AHJD ORF|6626-8389|2|
(587 letters)

>gi|138123|sp|P04331|VG9_BPPH2 TAIL PROTEIN (LATE PROTEIN GP9)
>gi|75850|pir||WMBPT9 gene 9 protein - phage phi-29
>gi|215327 (M14782) tail protein [Bacteriophage phi-29]
>gi|225364|prf||1301270D gene 9 [Bacillus sp.]
Length = 599

Score = 92.4 bits (226), Expect = 8e-18

Identities = 126/618 (20%), Positives = 251/618 (40%), Gaps = 71/618 (11%)

Query: S TNFKFFYNTPFT-DYQNTIHFNSNKERDDYFLNGRHFKSLiDYSKQPY-NFIRDRMEINVD 62 

TN + + PF+ DY+NT F S* + *+P R + ♦ SK + F ♦+ ++V 
Sbjct: 9 TNVRILADVPFSNDYKNTRWFTSSSNQYNWF- - NRKSRVYEMSKVTFMGFRENKPYVSVS 66 
Query: 63 MOWHDAQGINYMTPLS-DPTDRRYYArVNQIBYVNDVVVKIYrviDTIKrrrQGNVLEQL 121 

+ +Y+ F + D+ ++ +YAFV «.+E+ N V *+F ID + T+ ++ 

Sbjct: 6 7 LPIDKLYSASY1MFQNADYGNKWFYAFWELEFKNSAVTYVHPEIDVLQTWMFDMKFQES 126 

Query: 122 SNWIERQHWXRTWnfMLPMLRNNDDVLKVSNKNY^ 18l 

I R+H+ K + P+D+L+-++ + + ++P S 

Sbjct: 127 F---IVREHV-KLWNDDGTPTINTIDEGWYGSEYDIVSVENHXPYDDMMFLVIISXSIM 182 

Query: 182 FGT--KKEPNLDTSKOTIYDNITSPVNLYVMEYGDFINFMDKMSAYPWrTQNFQK V 235 

GT ++E L+ ++ + + P+ Y+ + ♦ D +1 N V 

Sbjct: 183 HGTPGEBESRLNDINASL-NGMPQPLCYYIHPF YKDGKVPKTYIGDNNANLSPIV 236 

Query: 236 QMLPKDFINTKDLEDVKTSEKITGLKTLKQGGKSKEWSLK-DLSL SFSNLQ 285 

ML P + D+ + +T LK Kt + LK D + N+ 

Sbjct: 237 NMLTNIFSQKSAVNDX -VMMYVTDYIGLfCLOYKNGDKELKLDKDMPEQAGIADDKKGNVD 295 

Query: 286 EMMLSK JGDEFKHMIRNEYMTIEFYOWNGNTMLLDAGKISQK 326 

+ + K KD+ ++ Y E D+ GN M L 1 + 

Sbjct: 296 TIFVKKIPDYEALEIDTGDKWGGFTKDQESKLMMYPYCVTEITDPKGNHMtraiKTBYtNMS 355 

Query: 327 TGVKUITXSIIGYHNSVRVYPVDYNSAENDRPILAKNKEILIDTGSFI^^ 386 

+K++ ♦ +G M+V DYN+ D + S +N N 
Sbjct: 356 K - LXI Q VRGSLGVSNKVA YSVQD YNA - - -DSALSGGNRLTASLDSSLINNNPN 404 

Query: 387 PI LINNGILGQSQQANRQ- - KNAESQLITNRIDNVLNG SDPKSRFYDAVSVASNLSP 441 

I I N L Q N+ +» +S ++ N I G ♦ + A+ +AS++ 

Sbjct: 405 D I AI LNDYLSAYLQGNKNSLENQ KS S I LFNG IMGMIGGGX SAGAS AAGGSALGMAS SV - - 462 

Query: 44 2 TAIJ^KFNBSYNFYKQG^AEYKDLALQPPSVTESEMGNAFQIANSINGLTMXISVPSPKE 501 

T + + QA+ D+A PP +T+ AF N G+ + ♦ 

Sbjct: 463 TGMTSTAGNAVLQMQAMQAKQAD I AN I PPQLTKMGGNTAFDYGNGYRG VYVI KKQLKAEY 522 

Query: 502 ITFUJKYYMLFGFT^YNSFIEPINShTI^CNYLKCTGTYTIRDIDPMLWEQLKAILESG 561 

L ++ +G+++N + + NY++ ■♦ DI + ♦♦■♦••♦■+• I ++G 

SbjCt: 523 RRSLSSFFHKYGYKINRVKK- - PNLRTRKAFNYVQTKDCFISGDINNNDLQEIRTI PDNG 580 

Query: 562 VRFWHNDGSGNPMLQNPL 579 

♦ WHD GN ++N L 
Sbjct: 581 ITLWHTDNIGNYSVBNBL 598 



>gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9)
>gi|75849|pir||WMBP9Z gene 9 protein - phage PZA
>gi|216058 (M11813) tail protein [Bacteriophage PZA]
Length = 599

Score = 81.9 bits (199), Expect = 1e-14

Identities = 127/618 (20%), Positives = 248/618 (39%), Gaps = 71/618 (11%)

Query: 5 TNFKFFYNTFPT- DYQNTI HFNSNKERDDYFLNGRHFKS LDYSKQPYNFI RDRME - INVD 62 

TN + + PF+ DY+NT F S+ ♦ ++F *■ * SK + R+ I+V 

SbjCt: 9 TNVRILADVPFSNDYKNTRWPTSSSNQYNWF - - NS KTRVYEMS KVTFQGFRENKSYISVS 66 

Query: 63 MQWHDAQGINYMTTLS-DFEDRRYYAFVNQIETYVNDVWKIYFVIDT^ 121 

+♦ +Y+ F ♦ D+ ++ +YAFV ++EY N ++F ID + T+ M+ Q 

SbjCt: 67 LRIiJIiLYNASYIMFQNADYGNKWFYAFVTELE YKNVGTTYVHFE IDVLQW MFNIKFQB 125 

Query: 122 SNVNIERQKLSKRTYNYMLPMLRJTCTODVLKVSNKNYVYN- -QMQQYLENLVLFQSSADLS 179 

S I R+H+ K + P+ D+L ++++ +Y++L S+ 
SbjCt: 126 SF--IVREHV-lCLWNDDGTPTINTIDEGLirYGSEYDIVSVENHRPYDDMMFLVVISKSIM 182 

Query: 180 KKFGTKKEPNLDTSKGTIYDNITSPVNLYVMEY GD FINPMDK 221 

* E L+ ++ + + P+ Y+ + GD +N + 

SbjCt: 183 HGTAGEAESRUroiNASL-NGMPQPLCYYlHPFYKDGKVPKTFlGDNNANLSPIVNMLTN 241 

Query: 222 MSAYPWITQNFQKVQMLPKDFINTK DLEDVKTSEKITGLKTLKQGGKSKEWS 273 

+ + N V M D+I K +L+ K -f G+ KG + 

SbjCt: 242 IFSQKSAVNNI - -VNMYATrDYIGLXLDYKNGDKELKLDKDMFEGAGIADDKHGNVr/TIFV 299 

Query: 274 LKDL SLSFSNIjQEMMLSKIG3EFKHMIRNEYFn'IEFYDWGOTMLLDAGKISQKTGVK 330 

K +L + KD+ Y E D+ GN K L I +K 

SbjCt: 300 KKIPDYETI^IDTGDKMGGFTKDQESKLMKYPYCVTEVTDFKGNHMNLKTEYIDNNK-LK 358 



Query: 331 LRTTCSIIGYHNEVRVYPVDYNSAENDRPILAXNKEILIDTGSFLNTNITFNSFAQVPILI 390 
♦ +G N>V DYN* + L+ ♦ L+T++ N+ + 1+ 
Sbjct: 359 IQVRGSLGVSMKVAYSIQDYNAGGS LSGGDRLTAS LDTSLJNNNPMDIAII - 409

Query: 391                                                              441
           N L Q N+ *N +S N I +L G A + A SP
Sbjct: 410 - NDYLSAYLQGNKNSLENQKSS I LFNGI VGMLGGG VSAGASAVGRS PFGLASSV 462

Query: 442 TALFGKFNEEYNFYKQQQAEYKDLALQPPSVTESEMGNAFQIANSINGLTMKISVPSPiCE 501
           T + + QA+ D+A PP +T+ AP N G+ + ♦
Sbjct: 463 TGMTSTAGMA\^t4QALQAXQADlANIPPQLTKKGGNTAFDYGMGYRGVYVlKKQLXA£Y 522

Query: 502 ITFl^KTWLFGFBVNDYNSPIEPINS^^^VCKYIiKCTGTYTIRDlDPMLMEQL^CAILESG 561
           L *■+ +G+++M + + NY+ + +■ DI+ +++++ I ++G
Sbjct: 523 PvRS LS S FFHKYGYKI NRVKK - - PNLRTRKAYNYIQTKDCFISGDINNNDLQEIRTIFDNG 580

Query: 562 VRFWHNDGSGNPMLQNPL 579
           + WH D GN +♦» L
Sbjct: 581 ITLWHTDDIGNYSVEHEL 598





>gi|1429238|emb|CAA67657| (X99260) tail protein [Bacteriophage B103]
Length = 598

Score = 77.6 bits (188), Expect = 2e-13

Identities = 130/623 (20%), Positives = 240/623 (37%), Gaps = 86/623 (13%)

Query: 5 TNFKFFYNTPPT - DYQNTIHFNSNXERDDYFLNGRHFKSLDYSKQPYNF1 RPRME IN 60 

TMFM PP+ DY++T F + + YF ♦ K ♦ NF+ I 
Sbjct ! 9 TDVMPSNVPPS1TOYKSTRWPTKADAQYSYF NAXPRVHVINBCNFVGLXEGTPHIR 64 

Query: 61 VDMQWHDAQGIHYMTFLS-DFEDPJIYYAFVNQIEYVITOVVV^ 119 

V* + D YM F ♦ ♦ +♦ tY PV ++EYVN V +YP ID I T+ ♦ 

Sbjct: 65 VNKRIDDLYNACYMIFRKTQYSlJKWFYCFVTRLEYVNSGVTMLYraiDVIQl^ 123 

Query: 120 QLSimrXERQHI£KRTYNYMU>MLRNNDDVLCT^ 179 

QS+BQ+ P+ D+ L ♦ V Q ++F S 

Sbjct: 124 QPSYIVREHQEMWDANNB- - - PLTKTIDEGLNYGTEYDVVAVEQYKPYGDLMFMVCI SKS 180 



Query: 1B0 KKFGTKKBPNLDT5KGTIYBNITS- - - PVNLYVMEYGDFINPMDKMSAYPWXTQNPQKVQ 236 

K T E 61 HI P++ YV + ♦ D S P +T WQ 

Sbjct: 191 KMHATAGET- - - FKAGEIAANINGAPQPLS YYVHPF YBDGSS - - PKVTIGSNBVQ 230 

Query: 237 ML- PKDFINTXDLBDVKTSBKITGLKI LKQGGKSKEWSLKDLSLSFSNL 284 

+ P DF+ ++ + T + +K SL+D ♦ + 

SbjCt: 231 VSKFTDFLiOmPTXJKHAVlWIVSLYVTDYIGLNIHYDESW 290 

Query: 28S QEHMLSKXDEFKHMIRHBYKriEFY DWNGNTMLLDAGX 322 

+E + +F MB + Y 0+ GN ♦ + 

Sbjct: 291 PNVNTIYLKEVKBYEBKTI0TGYKFASFANNEQSKIXMYPYCVTTITDFK^ 350 

Query: 323 ISQKTGVKLRTKSIIGYHNEVRVYPVDYNS-- -AENDRPILAKNKEILIDTGSFLNTNIT 379 

++ + ♦ *G N+V DYN-f D+- + A HT++ 
Sbjct: 351 VNG - SNLKXQVRGSIX*VSN1CVTYSVQDYNADTTLSGDQNLTAS ---CNTSLI 398 

Query: 38Q FNSFAQVPILINKGILGQSQQANRQ- -KNABSQLITMRIDKVLN- - -GSDPKSRFYDAVS 434 

N+ V 1+ N L Q N+ +N + +♦ M + ++L G+ + AV 
Sbjct: 399 NNNPNDVAI I - - NDYIiSAYLQGNKNSLENQKDSILFNGVM5MLGNGIGAVGSAATGSAVG 456 

Query: 435 VASNLS PTALFG KFNBE YNFYKQQQAE YKDLALQ P PSVTES EMGNAFQ I ANS ING LTMKI 494 

VAS S T + + QA+ D+A PP + + A+ M G+ + 

Sbjct: 457 VAS - - S ATGMVSS AGNAVLQI QGMQAKQAD I ANT P PQL VXMGGNTAYD YGNGYRGVYVI K 514 

Query $ 4 95 SVPSPKEITFlKIKYYMLFGFEVNDYNSFIEPINSMIVCmiJCCTGTYT 554 

+ L + +G*+ N ♦ ♦ +• NY++ I 

Sbjct: 515 KQIKEEYRNILSDFSRKYGYJCTNLVK- -MPNLRTRESYNYVQTKDCNIXGNLNNBDLQKI 572 

Query: 5S5 KAILESGVRFWHNDGSGNPMLQN 577 

+ I +SG+ WH D G+ IN 
Sbjct: 573 RTIFDSGITLWHADPVGDYTXNN 595 

>gi|215339 (M12456) p9 tail protein [Bacteriophage phi-29]

>gi|224163|prf||1011232C protein p9, tail [Bacteriophage
phi-29]

Length = 335
Score = 71.0 bits (171), Expect = 2e-11

Identities = 64/293 (21%), Positives = 123/293 (41%), Gaps = 20/293 (6%)

Query: 292 KDEFKHMIRNEYMTIEFYDWNGNTMLLDAGKISQKTGVKLRTKS I IGYHNEVRVYPVDYN 351 

KD+ +♦ Y E D+ GN M L 1+ + +G N+V DYN 

Sbjct: S7 KDQES KLMMYPYCVTEITDFKGNHMNLKTEYINNS K - LKIQVRG S LGVSNKVAYSVQDYN 115 

Query: 352 SAENDRPILAKNKEILIDTGSFLNTOITFNSFAQVPILIKNGILGQSQQANRQ- -KNAES 409 

+ D + N+ S +N N I I N L Q N+ +N +S 

Sbjct: 116 A DSALSGGNRLTASLDSSLINNNPN DIAILNDYLSAYLQGNKNSLBNQKS 165 

Query: 410 QLITNRIDNVLNG SDPKSRFYDAVSVASNLSPTALFGKFNEEYNFYKQQQAEYKDLA 466 

♦ + N 1 ++ G ♦ + A+ +AS++ T + + QA+ D+A 

Sbjct: 166 SILFNGIMGM1GGGI SAGASAAGGSALGMASSV- -TGHTSTAGNAVLQMQAMQAKQADIA 223 

Query: 467 LQPPSVTESEMGNAFQIANSINGLTMK1SVPSPKEITFLQKYYMLFGFEVNDYNSFIEPI 526 

PP +T+ AF N G+ + + L ++ +G+++N + 

Sbjct: 224 NI PPQLTKMGGNTAFDYGNGYRGVYV I KKQLKAE YRRS LSS F PHKYGYKINRVXK- - PNL 281 

Query: 527 NSMTVCNYLKCTGTYTIRD IDPMLMEQLKA1 LESGVRFWHNDGSGNPMLQNPL 579 

+ NY++ + DI+ I ++G+ WH D GN ++N L 

Sbjct: 282 RTRKAFNYVQTKDCF ISGDI NNNDLQE IRT I FDNG I TLWHTDN I GNYS VENBL 334 



>gi|1181968|emb|CAA87738.1| (Z47794) tail protein [Bacteriophage
CP-1]

Length = 230
Score = 53.9 bits (127), Expect = 3e-06

Identities = 29/113 (25%), Positives = 54/113 (47%), Gaps = 3/113 (2%)



Query: 1 MRKLTNFKFFYNTPF-TOYQNTIHFNSKKERDDYFLNGRHFKSLDYSKQPYNPIRDRMEI 59 

M ++ x + +pp DY N I+F + + +D+P ♦ Y + * + I 

Sbjct: 1 MQESTKIWLYAKSPFKOTDYANVINFETRESMEDFFTKKNPHIEIVYEYDKFQYTQRNGSI 60 

Query; 60 NvBMQWHDAQGINYKTFIjSDFEDRRYYAPvTiQIEYVin)vVVKIYTv^l>TIMTY 112 

V + + + YM F+++ R YYAFV ♦ Y+N+ +1 ♦ +D TY 
Sbjct: 61 WSGRVEKYENVTYMRFINN- -GRTYYAFVFDVLYINEDATRI IYEVDVWNTY 111 

>gi|1181970|emb|CAA87740.1| (Z47794) tail protein [Bacteriophage
CP-1]

Length = 586
Score = 42.2 bits (97), Expect = 0.010

Identities = 79/381 (20%), Positives = 139/381 (35%), Gaps = 92/381 (24%)

Query: 277 LSLSFSNLQEMMLSK--KDEFK HMIRNF/yMTIEFTDWNGNTMLLOAG KISQKT 327 

L +QE + S KD+ + ++ +E+ IE YD GN+ + 1 + 

Sbjct: 187 LKIAYDQIQEGLRSYMGroDLEIEVQLLHSEFTEIELYDlYGNSYVYQPQYLPRTIDEAH 246 

Query: 328 GVKLRTKSIIGYHNEVRVYPVDYMSAEN DRPIL - 360 

K+ +G N+V + ++YN+A N D+ IL 

Sbjct: 247 KYKVIVSGSLGDSNQVHINFI^YNNANNVSVADKNILDSLESGDWAEHNPEHFKYGLNDV 306 

Query: 361 -AKNKElLlDT-GSFL^ITraSFAQVPILINNGILGQSQQANRQKNABSQLITHRIDll 418 

K+ IL D S++ +♦ Q+ M +L QS + ++ A + ♦ 

Sbjct: 307 TGKSVAILNDAEASYIQST1KNQMEHTQLTFKENRDMLKQSVDLSNKQVATANSQASYNAQ 366 

Query: 419 VLNCSDPKSRFYDAVSVASNLSPTALFGKF NEBYNFYKQQQ- - 459 

S + S N++ L G F N +YN QQ 

Sbjct: 367 F AVDS ANI NQWTEGASG I UTVAGMLLTGNFGGAIXXjLASGGMKVTUAKRDYNDKVVQQGF 426 

Query: 460 AEYKDLALQPPSVTESEMGHAFQIANSIM 488 

A DL QP SV ♦ AFQ N + 

Sbjct: 427 TS ENNALKS Q SNA1ANMKS K I ALDQ S I RAYNATMAD LQNQP ISVQQIGNDLAFQSGNRLT 486 

Query: 489 GLTMKISVPSPKEITFLQKYYMLFGFEVlTDY-NSFIEPINSMTVCNYliKCTGTY-' -TIRD 545 

* K+S+ ♦+ +Y +GVM + N + +S HY+K T+R 
Sbjct: 487 DVYWKVS LAQKE I MGRANEY I XCYGVLVNWFTNDALS VMRSRKRFNYI KMINVNLGTLR - 545 

Query: 546 10 PMLMEQLKAI LESGVRFWH 566 

+ M ++AI +SGVR W* 
Sbjct: 546 ANQSHMNAIQAIFQSGVRIWN 566 
Query= pt|110875 44AHJDORF005 Phage 44AHJD ORF|12643-13890|-1 1
(415 letters)

>gi|3845203 (AE001399) GAF domain protein (cyclic nt signal
transduct.) [Plasmodium falciparum]
Length = 1245

Score = 52.3 bits (123), Expect = 6e-06

Identities = 59/246 (23%), Positives = 105/246 (41%), Gaps = 27/246 (10%)

Query: 174 ESIDRJfflGNVDYIGFPKMFLLGNAVNFSSPILSNLNI YNLLQKHXMNTSRLYKNIFLEMR 233

+S D N+ N + + N+V FS+ N IY++L M +YK + E*

Sbjct: 854 DSSDNNNNNNNNNIJNNNNYNNNNSVIFST NEKIYDML NRDNIYKKVKKEIF 904

Query: 234 RNDYVNEXIWRAFNSNDDAmTGEFEFNEYOTJUJDNLRiraiNQNGDFFYIKTODKYI - - 291

D++ + +N+M + N N N«- N NGD Y KY

Sbjct: 905 EGDSIIKTOENKPhnjTNKNYMNNDNXDNNNNNNNNNNIDN^ 964

Query: 292 KVMYNVTTFMTNI I VVFYTKQYEFCTKIR - DIDNHVTYLRDDMFYKENMERYYYNPSNLH 350

++N ++ + + + K E K+ 1 * L +F+K NM + + L+

Sbjct: 965 TSIFNKDLYVKHFVDIIMNXSLEEIIKMNVYISERINSL LFHKGNM LNDVTKLY 1018

Query: 351 FDNAYSKNYWDNDRYLYLDMNKI 1 KFHIXNEMKKNMSEFERKEKI YEDN YIENTK 406

NAY + N K I F ♦ E K +M F+ +KIY+ N + N K

Sbjct: 1019 MSNAYGEKCFFFN FPQIKEI IFVNEYEIOCMDMKYFKMLKKIYKYKLNKIFSNNYK 1073

Query: 407

+++K+

Sbjct: 1074

>gi|3758843|emb|CAB11128.1| (Z98551) predicted using hexExon;
MAL3P6.23 (PFC0820w), Hypothetical protein, len: 4982 aa
[Plasmodium falciparum]
Length = 4981

Score = 49.2 bits (115), Expect = 5e-05

Identities = 67/287 (23%), Positives = 110/287 (37%), Gaps = 60/287 (20%)

Query: 127 ITDLNSATDLXYHSNFIJOT/PIIIYDEFlALEDDYLIDEWDKliKT IYESIDRNHGN 182

I D+N + D+ + I YD +++DK++ IY +ID++ N

Sbjct: 3619 IMDINKSKDISKKMEIVQN IEYO NKYDKI RNDMDAI YMAIDKDMDN 3664

Query: 183 VDYIGFPKMFLLGNAVNPSSPILSNLNIYNL LQKHKMNTSRL YKNX FLBMRRNDYV 238

+ 1 +FL MS *U YNL ++KNRYMF +D

Sbjct: 3665

Query: 239

N N+N++ N N ++N N+ N NG F+ O

Sbjct: 3723 INNNNNNNNNNNNNNNNNNNNNNNNNNNNNN^ 3771

Query: 299 TFWTNIIWPYTKQYEFCTKIRDIDNHVTYIjRDDMFYKENMERYYYNPSNL^ 358

K FCTK ++F +N+B N N H Y+ N

Sbjct: 3772 KDLFPCTK KNIFPCKNIETVCKNEYNKKIYNNYTCN 3807

Query: 359 YVVDNDRYLYLDMNKI IKFHIKNEHKKNMSEFERKEK- XYECNYIEN 404

V+N + ++IK ♦ ♦ N E+ ♦ BK +Y + EN

Sbjct: 3808 ISVNNTl^CLNIIKELIKUINNKKKILNYYEYHKVEKIAYYRHSFEN 3854

Score = 35.6 bits (80), Expect = 0.70

Identities = 62/290 (21%), Positives = 121/290 (41%), Gaps = 65/290 (22%)

Query: 2 VKQNRLDMVRDYQNAVN--HVRKKIPDKYNQIELVT)Eli!NDDlDYYISISNRSIX3K5FlIY 59

+K+N +♦ +N +N +Vf+ DK N I D-M-I+ SN + +SF

Sbjct: 4445 IKRNN1NKSNIKRNNINKSNVKRSNTDKSNVIS DFHIT-SNNNITRSFT- 4492

Query: 60 VSFFIYIAIKLDIKFTLLSRKYTLRDAYRDFIEEIIDENPLFKSKRVTFRSARDYLAIIY 119

A D F LS TL *Y +F + ♦ I
Sbjct: 4493 ATLTDSIFNTLSE- -TLNYSYDNFFSNMDN IKI 4523

Query: 120 QDKEIGVITDLNSATDLKYHSNFLKHYPIIIYDEFL ALEDDYLIDBWDKUCTIYE 174

♦ El ITD++ +YH N+LK + +£♦+ ♦ +D + DB «-+T+ E

Sbjct: 4524 KKNEINNITDVDYGNKKEYHENYLKVKQNKVNEEYIEETFKSDKDCSIKDEACTIRTLSE 4583

Query: 175 S- - IDRNHGMVDYIGFPKMFLLGNAVNFSSPILSNLNIYNLLQKHKMN- -TSRLYKNIFL 230

SIN N+D ♦ + + S P N++ N + +K+ +N R+ KN 

Sbjct: 4584 SCNISBN1SNID MDDEDHISFPNGRNVHDNNYMKKNHVMYDKMRVGJQiKIP 4634

Query: 231 EMRRhTOYVNEKRNTRAFNSNDDAMTTGEFEFNEYNLADDNLRKHINQNGD 280 

D + *■*■+ + +D M++ ++ E ++ ■»• L + NG+ 
Sbjct: 4635 SFTHFDKILDEKKKK SDKDMSSS KWLEREEK I KE I KLE KNE YMNGN 4680 

Score = 34.0 bits (76), Expect = 2.0

Identities = 47/211 (22%), Positives = 84/211 (39%), Gaps = 32/211 (15%)

Query: 210 IYNLI^KHK>OTSRLYKNIFLEMPJ^YVNEKJttTTtt 269 

I++LLQK LY+N+ + R ♦ N+ T E ++ + ++ 

Sbjct: 918 IFSLLQKDSSPLLVLYENVHI REGEKYGRNS - - ATDNEVDYKKGDI I KH 964 

Query: 270 NLRNHINQNGDFFYIKTD- - -DKYIKVMYNVTTFMTNIIVVPYTXQYEFCTKIRDIDNHV 326 

N+ N + D ♦ D+ K MY ♦ V E K D+- N + 

Sbjct: 965 NVTNEHGNHSDSYPYGMSLNLDRKPKNMYE- DI YKEKGFVKSDCSNIEI ■• - KXNDMINND 1021 



Query: 327 TYLJU3DMPYKENMERYYYNPSNI^FDNAYSK>rYVVD KFHIKNE 382 

Y FY+++ Y+ ♦ YV++ +YL ♦+ P +KN+ 

Sbjct: 1022 VYKKNE - PYEDSRINMI YDEDEI KTWFLI PHKYV IN - - - 1 I YLFLNI LLTDESNFKLKNK 1077 

Query: 383 KKKNMSEFERKEKIYEDN YIENTKKY 408 

E K IYEDN ++N KKY 

Sbjct: 1078 KYGYFVNBETKGTI YEDNNGLQE ILKNGKKY 1108

Score = 33.6 bits (75), Expect = 2.7

Identities = 42/198 (21%), Positives = 77/198 (38%), Gaps = 42/198 (21%)

Query: 222 SRLYKN I PLEMR RNDYVNEKRNTRAF NSNDDAMTTGE FEPNEYNLA 267 

S LY I++ + +N ♦ K+NT + N +++D TT E + ♦ 

Sbjct: 411 SVLYSIIYMMKKYXKKNFIITNKKNTNVYPENDVIQLSVBNTSEDTFTTOTR^ 470 

Query: 268 DDKLRNHINQNGDFFYI1CTDDKYIKVMYNVTTFMTNI I WPYTKQYEFCTKIRDIDNHVT 327 

+++R +N D +DDK ++Y N YTK E 
Sbjct: 471 MNDMRYSVNHYADEKVYHSDDKSOHLIYXHVHDEKNKYDEMYTKTKB - 517 

Query: 328 YUIDDMFYKENMERYYYNP S NLHFDNAYSKNYWDNDRY L YLDMNKI I KFH I KNEMKKNM 387 

+++ YK N+ + H K LD+ K I H+KN+ + N 

Sbjct: 518 - -NBNTIYKSNIVDJQCTCDISSEMVNGKDX LDVEKYIGSHVKND-ENNK 563 

Query: 388 SEFERK-EKIYEDNYIEN 404 

+ ++K + + + YI+M 
Sbjct: 564 EKLKKKIDNVNKKEYIDN 581

>gi|3845297 (AE001421) hypothetical protein [Plasmodium falciparum]
Length = 2380

Score = 48.0 bits (112), Expect = 1e-04

Identities = 87/390 (22%), Positives = 160/390 (40%), Gaps = 65/390 (16%)

Query: 20 VRKKIPDKYHQIELVDELMNDDIDYYISISNRSDGKSFNYVSFF IYLAIKLDIKF 74 

+++K +K ♦+ + +N D + ++ R K+ NY++ +YL I OI 

Sbjct: 104 9 LQRKHNNKCSKNRNRNRYINraSMIHI^ 1108 

Query: 75 TLLSRHYTLRDAYR DFIEEIIDEN- PLFKSKRVTFRSARDYLAI IYQDKEIGVI 127 

+Y Y * + ♦ EN + ♦++ ++ + Y +K+ 

Sbjct: 1109 QFNIQINYNVQNFYNFSITLINIMSKYYSENFYAYNI^KIVYKFLLNNKNFEY 1168 

Query: 128 TDLNSATDLKYHSNFLKHYPI IIYDEPLA LEDDYLI DSHD KLKTI YES IDRNKGNV 183 

D+N D+ ++ +K+ II EFL L+ D I ♦ KLKT ++ 
Sbjct: 1169 EOMNEL-DILVNTYDMKYDKII- - -EFLKNNGYLXIDRYIYFYPKLKT DI 1214 

Query: 184 DYIGFPKMFLLGNAVNFSSPILSNLNIYNLLQKHKMNTSRLY KNIP--LEMRRN 235 

F ++FL N + L NI +♦+ K + Y K IF + M+ ♦ 

Sbjct: 1215 ILFFFXEIFLNDNILKIDRKFLIG<-NITIMIEVLKEIFFKEYVlG^CITKVIFFPVHMltEH 1273 

Query: 236 DYVKEKR NTRAFNSNDDAMTTGEFEFNEYNLADDNLRNHINQNGDFPYIKTD 287 

D+V K N+ FN* D + N YN D+ N+ N M +Y K 
Sbjct: 1274 DHVMNKNYYNNQYVNNSNMFNTRGDHNNNNQTl^ - KNK 1332

Query: 288 DKYIKVMYNVTTFMTNIIV VPYTKQYEFCTKIRDIDNHVTYLRDDMPYKEN MB 340 

+K K+MY + V K + K I + Y+++ N + 

SbjcC: 1333 NKN-KIMYEKERKSSSLPISNNVQDVKFIKHYLKYSSIYKNFIYIISEIKNFNNKITKIN 1391 

Query: 341 RY-YYMPSNLHFDNAYSKNYWDNDRYLYL 369 

RY YYN NL+ D* ND YL+L 

Sbjct: 1392 RYNYYNYMNLNIDDL NDAYLFL 1413

Score = 32.5 bits (72), Expect = 6.0

Identities = 46/183 (25%), Positives = 73/183 (39%), Gaps = 26/183 (14%)

Query: 225 YKNIFLEMRRNDYWEKRNTRAFNS^DAMTTCT^ 294 

+KNI ++K ♦ NSN + + N N+ *N N IN ♦ I 

Sbjct: 27 HKNINKNIKKKKFINIDNSNNCNNSNSNNSNSNNNNNNNNNIVTOJN-NNFINADKKXNVI 85 

Query: 2 85 KTDDKYIKVWYNVTTFWTOIIVVPYTKQYEFCTKIPJJIDMHvTYL^ 344 

+D IK V NI Y ++ ♦ D+ N+ + + KE ER 

Sbjct: 86 LNEDDDIKNKELVDESFVNIFP- -YENYFKNLFNLNDVSNNKVI - -NIIEQKEGDER- - - 138 

Query: 345 NPSNLHFDNAYSKNYWDNDRYLYLDMWCIIICFHIKIIEM^^ 404 

N N M +KNVBN +NK IKN +N++E Y N++ ♦ 

Sbjct: 139 NADN NIKNKNIVRDN INK I KN - - TRNVNBI L1YNNKYI INFLND 180 

Query: 405 TKK 407 
T K 

Sbjct: 181 TTK 183 



>gi|4493936|emb|CAB38972.1| (AL034556) predicted using hexExon;
MAL3P5.6 (PFC0600w), Hypothetical protein, len: 250 aa
[Plasmodium falciparum]
Length = 249

Score = 47.3 bits (110), Expect = 2e-04

Identities = 53/215 (24%), Positives = 87/215 (39%), Gaps = 30/215 (13%)
Query: 209 niynllqkhkmntsrlyknifle>irrneyvnek^ 266 

NIYN L++ YKN H ♦+ +N N+N EFE N YN 

Sbjct: 13 NIYNKLEEK YKNFLKLKNMNSHMGASQKMNV - NNNYTMNELEEFBKINNNYNN 64 

Query: 267 ADD NLRNH I NQNGD FFYI KTD DKYI KVMYNVTTFMTNI I VVPYTKQYEFCTK1RD 321 

++N+ N+IN D+ IK +K ++ YN +1 T 

Sbjct: 65 NNNNINNNI NNYYDYMNI KVS QS VQHNKRLQD PYNNKNS FQHYI KKLKTCRFD ADD IRNL 124 

Query: 322 IDNHVTYLRDDHPYK ENMERYYYNPSNLHFDNAYSKNYVVDNDRYLYLDMNKI I K 376 

++ + Y RD+ K EN + N + N+SNY 0N+ LY +N++ K 

Sbjct: 125 LEKRLAYERDNTItlKNIQBBENKJCGIGINGNFGSESNSSSSNY- - DNNYLLYRKINRLNK 182 

Query: 377 FHIP31EMKKNMSEFERKEKIYEDNYTENTKKYLMK 411 

♦ ++ KI KKY++K 
Sbjct: 183 TNTNKS KNRSRKRKRIN5 KI DKKYIIK 209 

>gi|3845165 (AE001390) hypothetical protein [Plasmodium falciparum]
Length = 1247

Score = 45.7 bits (106), Expect = 6e-04

Identities = 52/239 (21%), Positives = 94/239 (38%), Gaps = 38/239 (15%)

Query: 206 SNLNI YNLLQKHKMNTSRLYKNI FLEMRRNDYVNEKRNTRAFNSNDDAKTTGEFEFNBYN 265 

+N N +N ++K K R I +N + +N ++N+D EN N 

Sbjct: 4 74 NNTNICWNBIKKRKKKFKRBKNKI XNNSFQNQEAEDDKNNNNNDNNNDNKNDNNNENNNEN 533 

Query: 266 LADDNLRNHINQNGDFPYI-KTDDKYIK VMYNVTTFMTNI I WPYTKQYEFCTKIR 320 

D+N N+ + N D I D+ Y +YN T ++ YTK ♦ ♦ + 

Sbjct: 534 NNDNNNENNNDINNDINNIHNNnNNYYNNDNINLYNEMriOCKCMLDNSYTK^ 592 

Query: 321 D I DNHVTYLRDDMFYKENMB RYYYN PSNLHFCNAYS 356 

+ + ++ ♦ FY++N ♦ ++YYN + N 

Sbjct: 593 DMLPS I KFETFYEKNTDKKNFNENYKFYYNTDDDTD I I NAI KKKNVKNKKKNGNI VI 649 



Query: 357 KNYVVDNDRYLYLDMNKIIKFHIKNEHKKNMSEPER-- - - KEKI YEDNYI ENTKKYLMK 411

KNY+ N+ Y YL+ N+ + I + K +E K+ 1+ ♦♦YE K K

Sbjct: 650 KNYINHNE- YSYlJiWENKNYEINKKEKI^TENYEY 707

Score = 41.0 bits (94), Expect = 0.016

Identities = 58/245 (23%), Positives = 96/245 (38%), Gaps = 43/245 (17%)

Query: 207 NIA'IYNLLQKHKMNTSRLYraJIFLEMRRhTOYVNEKRNTRAFNSNDDAh^ 266 

N+N+YN + KK Y F + D + ♦ + ND E YN 

Sbjct: 564 N INLYNEMTKKKCMLDNS YTKYF FYI FT LDHL PS I KFETFYEKNTDHKNFNENYKFYYNT 623 

Query: 267 ADD NLRNHINQNGDFF- - -YIKTDDKYIKVMYNVT-TFMTNIIVVPYTKQ 312 

DD N++N +NG* YI ♦+ Y + YN + N T+ 

Sbjct: 624 DDDTDIINAIKKKNVlWK-KKNGNrviKNYIbnME-YSYLEYNEHKNYEINKKEKLLTEN 681 

Query: 313 YEFCTKIRDIDMiVTYLRDDMFYKENMERYYYNPSNLHFDNAYSK NYV--VD 362 

YE+ I+D ++ Y D + + YN +N +N Y K +Y+ VD 

Sbjct: 682 YBYDMYIKDNIHYNDYSEGDGKQTKKASSFLYNNNN NNKYKKEDNXTQIISYMDHVD 738 

Query: 363 KDR YLYLDMNKIIKFHIK-NEM KKNMSEFERKEKIYEDNYI8NTKKY 408 

N+ Y + F +K N+M K+ P +E X ♦ +EN K+ 

Sbjct: 739 MENGVKGLKKRNLFYNNSDQLYNFDVXDNDMIKYEKRQSKNFVEEEFIN(3NRroiENEDKH 798 

Query: 409 LMKQY 413 
L K Y 

Sbjct: 799 LKKHY 803



Query= pt|110877 44AHJDORF007 Phage 44AHJD ORF|2044-3027|1 1
(327 letters)

>gi|1181960|emb|CAA87731.1| (Z47794) connector protein
[Bacteriophage CP-1]
Length = 337

Score = 45.7 bits (106), Expect = 5e-04

Identities = 44/184 (23%), Positives = 84/184 (44%), Gaps = 13/184 (7%)

Query: 127 QIHKLYDNCMSGNPVVT4QNKJ?IQYNST)IEIIEHYTDEIJVEVAI*SRFSIiIMQAKFSK-- IP 184 

++HK + + +V+ N Y I +E + ++LA++ 1,+ L A+ IF 
Sbjct: 125 ELHKDNPDKI KRP CI VI PNNNF - VE PYIGYLELFCEKLADI ELT - IQLNRNAQITPYFI F 182 

Query: 18S KSEINDBSINQLVSEIYNGAPFVXMSPMFNAD DDI IDLTSNSVI PALTEMKR 236 

N S+ + ++I N P V ++ + D D I + L ++ 

Sbjct: 183 ADNTNVLS HKNI FNKIANFE PVVYLNKQKDQDGQDS FKQLSDYIQVFRTDAP FLLDKLHD 242 

Query: 237 EYQNKISELSNYLGINSIATOKESGVSDEEAKSNRGFTTSNSNIYIJCGR£P-ITFLSKRY 295 

E +f+L ++GIN+ DK+ + EA SN G ++N + K R + ++K Y 
Sbjct: 243 EKWWMNQIATFIGINNNPSDIO^RLWSEAISNNGVIS^ 3 02 

Query: 2 96 GLDI 299 
GL+I 

Sbjct: 303 GLEI 306 



>gi|1429239|emb|CAA67658| (X99260) upper collar protein
[Bacteriophage B103]
Length = 308

Score = 44.9 bits (104), Expect = 8e-04

Identities = 40/159 (25%), Positives = 73/159 (45%), Gaps = 11/159 (6%)

Query. 150 YNSDIEI IEHYTDELAEVA-LSRFSLIMQAKFSKIFKSEINDESINQLVSEIYNG 203 

YN+D++ +E + +LAE+ + + Q I ++ N S+ + ++ 

Sbjct: 121 YNNDLXCSTL P ALEMFAQ DLAELKE 1 1 AVNQNAQ KT P VLI AANDNNQ LS LKN I YNQYEGN 180 

Query: 204 APFVKMSPMFNADD - DI IDLTSNSVI P ALTEMKRE YQNKI S ELSNYLG I NS LAVDKESGV 262 

AP ♦ ♦ *D++ +V+LK N E+ YLGI ♦ >*K+ ♦ 
Sbjct: 181 APVI FVHESLDLDNLKVFKTDAP YWDKLNAQKNAVWN EVKTYLXJ I KKANLEXKERM 237 

Query: 263 SDEEAKSNRGFTTSNSN2 YLKGR- EPITPLSKRYGLDIX 300 

E SN S+ NIYLX R B +S>- YGL++K 

Sbjct: 238 VTSEVDSNDEQIESSGNIYLKARQEACNKISELYGLNLK 276 



>gi|137915|sp|P07535|VG10_BPPZA UPPER COLLAR PROTEIN (CONNECTOR
PROTEIN) (LATE PROTEIN GP10) >gi|75851|pir||WMBP10 gene



WO 00/32825 



PCT/IB99/02040 



298 

10 protein - phage PZA >gi|216059 (M11813) upper collar
protein [Bacteriophage PZA]
Length = 309

Score = 43.8 bits (101), Expect = 0.002

Identities = 38/160 (23%), Positives = 75/160 (46%), Gaps = 13/160 (8%)

Query: 150 YNSDIEI IEHYTDELAEVALSRFSLIMQAKFSKIF--KSEINDESINQLVSEIYN 202 

YN+D+ +E + ELAE+ S+ A* ♦ + +♦ N S+ Q+ ++ 

Sbjct: 122 YNNDMSFPTTPTLELFAAELAELK -EI I S VNQNAQKTP VLIRANDNNQLSLKQVYNQYEG 180 

Query: 203 GAP FVKMS PMFNADD - D I IDLTSNS VI PALTEMKREYQNKISELSNYLG INSLAVDKESG 261 

AP + ++D + V+ L K N E + +LGI + ++K+ 

Sbjct: 181 NAPVI FAHEALDSDS I EWKTDAPYVVDKLNAQKNAVWN - - - EMMTFLG I KN ANLEKXER 237 

Query: 262 VSDEEAKSNRGFTTSNSNIYLKGR-EPITFLSKRYGLDIK 300 

+ +8 SN S+ ++LK R E ♦++ YGLD+K 

SbjCt: 238 MVTDEVS SNDEQ I ESSGTVFLKSREEACE KINELYGLDVK 277 

>gi|137914|sp|P04332|VG10_BPPH2 UPPER COLLAR PROTEIN (CONNECTOR
PROTEIN) (LATE PROTEIN GP10) >gi|75852|pir||WMBPC9 gene
10 protein - phage phi-29 >gi|215328 (M14782) upper
collar protein [Bacteriophage phi-29] >gi|215340
(M12456) p10 connector protein [Bacteriophage phi-29]
>gi|224161|prf||1011232A protein p10, connector
[Bacteriophage phi-29] >gi|225365|prf||1301270E gene 10
[Bacteriophage phi-29]
Length = 309

Score = 41.4 bits (95), Expect = 0.009

Identities = 37/160 (23%), Positives = 75/160 (46%), Gaps = 13/160 (8%)

Query: 150 YNSDIEI IEHYTDELAEVALSRFSLIMQAKFSKIF- -KSEINDESINQLVSBIYN 202 

YN+D+ +E + ELAE + St- A+ ♦ + +♦ N S+ Q+ ++ 

SbjCt: 122 YNNDMAF PTTPTLELFAAELAELK - EI I SVNQNAQKTPVLI RANDNNQLS LKQVYNQYEG 180 

Query: 203 GAPFVXMSPMFNADD-DIIDLTSNSVIPALTEMKREYQNKISELSNYLGINSLAVDKESG 261 

AP ♦ ++D ++ ♦ V* L K N E+ *LGI + ++K+ 

SbjCt: 181 NAPVT FAHEALDSDS I EVFKTDAPYVVDKLNAQKNAVWN- - - EMMTFLG I KN ANLEKKER 237 

Query: 262 VSDEEAKSNRGFTTSNSNIYLKGR - EPITFLSKRYGLDI K 300 

+ *B SN S+ +*LK R E **+ YGL++K 

SbjCt: 238 MVTDBVSSNDEQIESSGTVFLKSREEACEKINELYGLNVK 277 

Query= pt|110878 44AHJDORF008 Phage 44AHJD ORF|3020-3775|2 1
(251 letters)

>gi|4982468|gb|AAD30963.2| (AF118151) SNF1/AMP-activated kinase
[Dictyostelium discoideum]
Length = 718

Score = 52.3 bits (123), Expect = 3e-06

Identities = 28/118 (23%), Positives = 56/118 (46%), Gaps = 5/118 (4%)

Query: 121 YLQSCOFTEHNEDTTSNTDETSNQNATSLDNSTGMTANRNAYV S LPQS EVN I DVDN 176 

+ ♦ GF N ♦+ SN + +N N + N+ T N N + ♦+ + +N + +N 
Sbjct: 382 FTTTTGFNPTNSHSISNNNNNNNNNNNNTTNNNN^ 441 

Query: 177 TTLRFADNNTIDNGKTVNK^SNESNQNAKRNQNQKGNAKGTQFTKQYLID - NIDKAYD 233 

♦NN I+N N ++N +N N N N N+ + T+ + I N++ +Y+ 
Sbjcti 442 NNNN I NNNN 1 1 NJ^rW^mT^^WNNNNNNNNNNNN SSI SGGTEVF SIS PNLNNSYN 499 

Score = 37.5 bits (85), Expect = 0.094

Identities = 17/111 (15%), Positives = 45/111 (40%)

Query: 130 HNBDTTSNTDETSNQNATS LDNSTGMTANRNAYVSLPQSEVN IDVDNTTLRFADNNTIDN 189 
+N + +N ♦ +N N ♦ +N++ * P + + +++ N+ ++ 

sbjct: 456 innrtmmmmnmnmmmmjss i sggtevfs i s pnlnns ynsnssgnsngsnsnnns sis 

Query: 190 GKTVNKSSNESNQNAKRNQNQKGNAKGTQFTKQYLIDNIDKAYDLRKKILN 240 

N +N +N N N N N ID+++ * + + N 

SbjCt: 516 Mn'^mmwmnftTOTOMMffmtlDSVlfflSMJENDVIJlJ 566 
Score = 32.8 bits (73), Expect = 2.4

Identities = 31/140 (22%), Positives = 57/140 (40%), Gaps = 14/140 (10%)

Query: 109 LNVVYSSSEVEKYLOSQGFTEHNEDTTS NTDETSNQNATSLDNSTGMTAWRNAYVSL. 165 

S N +T + N + +N N + +N+ N N ♦ 

Sbjct: 4 94 I^SYNSNSSGNSNGSNSNNNSNNNTNNDhJNNN>JNNNfmN^ 553 

Query: 166 PQSEVN- - IDVDNTTLRPADNNTIDMGKTVNKSS NESNQNAKRNQNQKGNAK 215 

+ +N DV+N+ + +NN D+G N ++ N N + M GN 

Sbjct: 554 VNNSUWENDVNNSNINNNNNNNSDDGSNNNSYEGGGDV^ 613 

Query: 216 GTQFTKQYLIDNIDKAYDLR 235 

Q L++++D D++ 
Sbjct: 614 MLNNNFQ-LLNSLDLNSD1Q 632 

Score = 31.7 bits (70), Expect = 5.4

Identities = 25/115 (21%), Positives = 48/115 (41%), Gaps = 10/115 (8%)

Query: 130 HNEDTTSNTDETSNQNATSLDNST GMTAN-RNAYVSLPQSEVNIDVDNTTLRFADNN 185 

♦N + +N + +N N +S+ T ♦+ N N+Y S S N ♦ M+ +N 
Sbjct: 462 NNNNNNNNNNNNNNNNNSSISGGTEVTSISPNLNNSYNS--NSSGNS^ 519 

Query: 186 TIDNGKTVNKSSNESNQNAKRNQNQKGNAKGTQFTKQYLIDNIDKAYDLRKKILN 24 0 

DN N ++N +N N N H N + ++++ D+ +N 

Sbjct: 520 NNDN NWINNNNNNNNNNNNNNNNNNNNNNNCIDSV^ 570 

Score = 31.7 bits (70), Expect = 5.4

Identities = 15/104 (14%), Positives = 43/104 (40%)

Query: 110 NWYSS S EVEKYLQ SQG FTEHNEDTT S bTOET SNQNATSUDNSTGOTANRNA YVS LPQSE 169 

N+ +++ + + +N + +N + +N N + +N+ * + V 

Sbjct: 434 NIKNNNTOWl^INNNNIINNOT^^ 493 

Query: 170 VNI DvTJNTTLRPADNNTlDNGKTVMKSSNESNQNAXRNQNQICGN 213 

+N ++ + ++ + +N N +■++ +M N N N N 
Sbjct: 4 94 UWSYNSNSSGNSNGS^JSNNNSNNOTNNENNNNNNNNNNNNNNN 537 

Score = 30.9 bits (68), Expect = 9.2

Identities = 16/84 (19%), Positives = 34/84 (40%)

Query: 130 HNEDTTSrn^ETSNQNATSIJDNSTGOTANRNAYVSLPQS 189 

+N + *N + +N N + +N+ + S+ ♦ N N++ +N+ +N 

Sbjct: 4S5 NNNNNNNNNNNNNNNNNNNNNNNNSSISGGTEVTSIS^ 514 

Query: 190 GKTVNKS SN ESN QNAXRNQNQ KGN 213 

+ N +N N N N M 
Sbjct: 515 SNNNTNNDNNNNNNNNNNNNNKNN 538 

>gi|1730077|sp|P18160|KYK1_DICDI NON-RECEPTOR TYROSINE KINASE SPORE
LYSIS A (TYROSINE-PROTEIN KINASE 1) >gi|974334 (U32174)
non-receptor tyrosine kinase [Dictyostelium discoideum]
Length = 1584

Score = 46.5 bits (108), Expect = 2e-04

Identities = 29/106 (27%), Positives = 48/106 (44%), Gaps = 4/106 (3%)

Query: 130 HNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNID- - -VDNTTLRFADN-N 185 

♦NED +SN + +N N ♦ +N+ N N ♦ + N + ++NTT N N 

Sbjct: 442 ^lNEDISSNNNN^INNNNNNNNNNNNNNNNNNNNNNNNNSNSS 501 

Query: 186 TIDNGKTVNICSSNESNQNAKRNQNQKGNAKGTQFTKQYLIDNIDKA 231 

+N N +SN +N N N N N TK+ I + D++ 

Sbjct: 502 NNNNNNNSNSNSNSNNNNINNNNNNNNNNNNIYLTKKPSIGSTDES 547 

Score = 34.0 bits (76), Expect = 1.1

Identities = 20/117 (17%), Positives = 46/117 (39%)

Query: 87 NRQTVEAFGMQVITVCITOEDYIjNVVVSSSEVZKYLQ 146 
N G IT T ++++♦♦ + +N + +N + +N N 
Sbjct: 415 NNNNNNIIGNGKITTTTTTSTSPSSI^EDISSHNN^^ 474

Query: 147 TSLDNSTGMTANRNAVVSLPQSEVNIDVDmTLRFADNNTIDNGKTVNKSSNESNQN 203 

+ T N N + ♦ N + N+ +tf N ++N +N N 

SbjCt: 475 m^SNSSJTOn^INJnrifl^SNSNmWNNNSN 531 

Score = 33.2 bits (74), Expect = 1.8

Identities = 18/88 (20%), Positives = 35/88 (39%)

Query: 130 HNEDTTSlTrOETSNQNATSLDNSTGMTAHRNAYVSLPQSEVNIDVDNTTliRFADNMTIDN 189 

+N ♦ ++N + +N K T T + S+ +E +N +NN +K 

SbjCt: 405 NNJRWSNNNNNNNNNNIIGNGKITTTTTTSTSPSSINNNEDI 4 £4 

Query: 190 GKTVNKSSNESNQNAKRNQNQKGNAKGT 217 

N ++N +N N+ + NT 
SbjCt: 465 NNNNNNNNNNNNNNSNSSNTNNNNINNT 492 

Score = 32.5 bits (72), Expect = 3.1

Identities = 18/94 (19%), Positives = 37/94 (39%)

Query s 120 KYWSQGrTEHMEDTTSKTDBTSNQNATSIJDNSTGffrAOTUIAYVSLPQSBVllIDVDNTTL 179 

K + S N ♦ +N++ +N M ++ + VT S N D+ + 

Sbjct: 392 rarWJSTSILVPNGMNNroiSNNMnraNN^ 451 

Query: 180 RFAD1WTIDNGKTVNKSSNESN0NAXRNQNQKGN 213 

+NN +N H ++H +N N + + N 
SbjCt: 452 NNNNNNNNWNNNNNNNNNNNNNNimNSNSSNTN 485 

Score = 32.5 bits (72), Expect = 3.1

Identities = 24/110 (21%), Positives = 44/110 (39%), Gaps = 10/110 (9%)

Query: 138 TDETSKQHATSLDKSTGOTANRNAYVSLPQSEWIOVDNTTLRFADNNTCDNaK 191 

T T++ + +S++N+ +++N N *• H + +N +-NN N 

Sbjct: 4 29 TTTTTSTSPSSlNNNEDISSNNNNlWNmJNNNinJmWMI^^ 488 

Query: 192 TVN^SNESNQNAKRNQNQKGKAKGTQFTKQYLIDKIDKAVDLRKK 237 

T N +SN 4M N N N N+ +N ♦ L XX 

SbjCt: 489 INNTrOTnraSNSNNNNNOTWSNSNSNSNlMNIMNNNNNW S3 6 

>gi|3758855|emb|CAB11140.1| (Z98551) predicted using hexExon;

MAL3P6.11 (PFC0760c), Hypothetical protein, len: 3395 aa
[Plasmodium falciparum]
Length = 3394

Score = 46.5 bits (108), Expect = 2e-04

Identities = 52/202 (25%), Positives = 96/202 (46%), Gaps = 32/202 (15%)

Query: 21 FNEFVNDNKLTFYDDEFQFMQKMLKFD - KDVIAIVNEKVFKGFSUCDELSDL- - LFKKSF 77

F ++ K T D+ H+K K D DV + NEK++ L + + KK

Sbjct: 665

Query: 78 TIMFLDREIHRQTVEAFGMQV ITVClTHEDYtMVWSSSEVEKYLQSQaFTEHNE 132

♦H + IN* + + +QV IV + OY + S + + K + +N

Sbjct: 722

Query: 133

+SN D +NQH +++N+ + N+N N +++N + N *V

Sbjct: 778 MSSNKDY- NNQNNQNt ENNQWI ENNQN NQNIEN NQNIEMNQNN 820

Query: 193

N +N++NQN + NQN ♦ HA

Sbjct: 821

Score = 33.6 bits (75), Expect = 1.4

Identities = 46/221 (20%), Positives = 89/221 (39%), Gaps = 37/221 (16%)

Query: 10 DFIKSELIKKGFNEFVNDNKLTFTODEFQFMQKMLKFDKDVIAIVNEKVFKGPSLKDEIjS 69 

O +K E K N + +L Y ♦ + M+K K + V K SL 

Sbjct: 367 DSLKIEYNKSKTNIQQLNEQLVNYKNFIKEMEKKYK QLWKNNSLFSITH 416 



Query: 70 DLLF KKS FTIHFLDHEINR QTVEA FGMQVITVCITH- - - EDYLNWYSSSEVEXYLQSQG 126 
D+K+I+R+++ +++IH +D+L+V+Y + + U *
Sbjct: 417 OF I N L KNSN 1 1 1 1 RRTSDMKQ I FXMYNLDIEHFNEQDHLSVIY IYEILYNTN 466 

Query: 127 FTE HNEDTTSNTDBTSNQNATS LDNSTGMTANRNAYVSL PQS EVN I DVDNTTLRFADNNT 186 

+N D +N D +N N ♦ +N+ N N N + +N ♦ 
Sbjct: 469 - DNNNNDNDNNNDNNN1WNNNNDIJNNNNNNDNNNN --NNNYNNIMW- M 512 

Query: 1B7 IDNGKTVNKSSNESHQMAKRNQNQKGNAKGTQFTKQYLIDM 227 

I +N + N +■+ + N + M N + N + + VST I+N 
Sbjct: 513 IENMNSGMHPNSNNLHNYRHNTNDENNLSSLKTSFRYK1NN 553 



Score = 32.8 bits (73), Expect = 2.4

Identities = 28/122 (22%), Positives = 53/122 (42%), Gaps = 2/122 (1%)

Query: 119 EKYUJSCCFTEHNEDTTSNTDETSNQNATSL^ 177 

E Y S + +++ N + *H + + DN+ N N ♦+ +N D ++N 

Sbjct: 2838 ENYPVSTHYDNNDDINKJN INNDNNNDN I NDDNNNDN I NNDNNNDN I NNDNI NNDN INND 2897 

Query: 178 TUlFADl^IDNGiaVNKSSNESNQNAKRNQNQKGNAKGTQFTKQYLlDNIDKAYDLRKK 2 37 

+N+ +NG SSN N NNKN+G + + + ♦ YD K 

Sbjct: 2898 NNNDNNNDNSNNGFVCELS5NINDFNN I LNVN - KDNFQG I NKSNN FSTNLSEYNYDAYVK 2956 

Query! 238 il 239 

1 + 

Sbjct: 2957 IV 2958 



Score = 32.5 bits (72), Expect = 3.1

Identities = 46/249 (18%), Positives = 101/249 (40%), Gaps = 31/249 (12%)

Query: 9 YDFIXSEI,IKKGETJEPvTroMKLTFYDDEFQFttQKMIJC^ 68 

Y+++K ++ N N NK E Q++ K+ ♦ + + +B K L++ 

Sbjct: 2150 YNYVK VQNATNREDNKHX ERKLSQEIYKYINENIDLTSELEKKNDMLENYK 2200 

Query: 69 SDL LFKXSFTIHFLDREINRQTVEAFGMQVITVCITHEDYI2IVVYSSSEVBKYL 122 

++L ++K ♦ I L + M+++ H ♦ B+ + L 

Sbjct: 2201 NEUC£X23EBI YKIJ1NDIDMLSNMGG<L!G2S IMMMEKYKI I MN NNIQEKDEIIENL 2255 

Query: 123 QSQGFTEHNEDTTSOTDETSNQKATSLDNSTGMTAN RNAYVSLPQSE VNIDV 174 

+<M . + +D +M ♦ ++S M+ + N + +L +S M+D+ 

Sbjct: 2256 KNX - YNNKLDDL I NNYS WDKS IVSCFEDSN I MSPSCND I LNVFNHLSKSNKKVCTNMD1 2314 

Query: 175 DNTTLRFADNNT I DNGKTVNKS SNESNQNAKHNQNQKGNAKGTQFTKQYL I DNIDKAYDL 234 

N + ++I+N +N +N +N N N N N K YL++N+ D 

Sbjct: 2315 CNENMDSI- -SSINNVNNINNVNNINNVNNINNVNN^ 2372 

Query: 235 RKKILNEFD 243 
1+ +P+ 

Sbjct: 2373 DNIIIIKPN 2381 



Score = 32.1 bits (71), Expect = 4.1

Identities = 20/103 (19%), Positives = 48/103 (46%), Gaps = 2/103 (1%)

Query: 115 SSBVEKYLQSQGFTEHHEOTTSNTDETSNQN - - ATSLDNSTGMTANRHAYVSLPQSEVH I 172 

+♦+ EKY EH + N D +N+N L ++ ++ + N S ++B+ 

Sbjct: 3264 NNDEEKYSCHDDKNKHTNNDLI^IDHEWKNNITDELYSTYNVSVSHNKDPSNKENEIQN 3323 

Query: 173 DWNTTIAFADKNTIDNG1CTVNKSSNESNQNAXRNQNQKGNAK 215 

♦ + D N ++ M ++E+++N ♦ ♦♦N ♦ + K 
Sbjct: 3324 LISIDSSHETOENDBNOENDENDENDENDENDEMDENDENDEK 3366 



Score = 30.9 bits (68), Expect = 9.2

Identities = 27/118 (22%), Positives = 53/118 (44%), Gaps = 15/118 (12%)

Query: 104 THBDYLNWYSSSBV EKYLQSQGFTEHNEDTTSNTDETSNQNATSLDNSTGMTANR 159 

T+ D LN+ + +++ BY HN+D ++ +E QH S+D+S N 

Sbjct: 3280 TNNDLI2JIDHDNNKNNITDBLYSTYNVSVSHNKDPSNKENEI - -QNLISIDSSNENDEND 3337 * 



Query: 160 NAYVSLPQSEVT* IDVDNTTLRFADNNTI DNGKTVNKS SNESNQNAKRNQNQKGNAKGT 217 

+++ N * D D N ++ N +E+++N ♦ ++M N +GT 

Sbjct: 3338 EN D END END END EN DENDENDENDENDEKDENDENDENDENFDNNNEGT 3386 
>gi|585795|sp|P21538|REB1_YEAST DNA-BINDING PROTEIN REB1 (QBP)

>gi|626139|pir||S4590? DNA-binding protein REB1 - yeast
(Saccharomyces cerevisiae) >gi|536280|emb|CAA84992|
(Z35918) ORF YBR049c [Saccharomyces cerevisiae]
>gi|559944|emb|CAA86391| (Z46260) REB1 DNA-binding
protein [Saccharomyces cerevisiae]
Length = 810

Score = 45.7 bits (106), Expect = 3e-04

Identities = 34/158 (21%), Positives = 72/158 (45%), Gaps = 14/158 (8%)

Query: 83 DREINRQTVBAFGMQVITVCITHEDYLNVVYSSSEVBCT 142 

D+ N+++VE ++ + V ♦ H+++ +++ K+ + Q E *• D N ♦+ S 

SbjCt : 7 DKNANQESVBEAVI^YVGVGLDHQNHDPQLirr^ 66 

Query: 143 NQNATSLDNSTGMTANRNAYVSLPQSEVNIDVDN7TLRFADNNTID NGKTVNKSSNE 199 

N+N + D+S ++A L +£ + +VD+ N +0 N+ +E 

Sbjct: 67 NRNEDNNDDSENISA LNANESSSNVDHANSNEQHNAVMDWYLRQTAHNQQDDE 119 

Query: 200 SNQNAKRNQNQKGNAKGTQFTKQYLIDN I DKAYDLRKK 237 

++N N GN F++ +♦ +D D KK 

Sbjct: 120 DDEN- - NNNTDNGNDSNNHPSQSO I V- - VDDDDDKNXK 1S3 

>gi|172372 fM5B72fl) DNA-binding protein [Saccharomyces cerevisiae]
Length = 809

Score = 45.7 bits (106), Expect = 3e-04

Identities = 34/158 (21%), Positives = 72/158 (45%), Gaps = 14/158 (8%)

Query: 83 DREINRQTVEAFGMQVITVCITHEDYIJftrWSSSEVE 142 

D+ N+++VB ++ + V + H+++ +♦+ K+ + Q B + D N S 

Sbjct: 7 DFMANQBS V^EAVIJCYVGVG LDHQNHD PQIJTTKDLENKHS KKQNI VESSND VHVNNNDDS 66 

Query: 143 NQNAT S LDNSTGMTANRNAYVS LPQS BVN I D VDNTT LRFADNNT I D - - - NGKTVNKSSNE 199 

N+N + D+9 ++A L +E +• +VD+ N +D N+ +E 

SbjCt: 67 NRNEDNNDDSENISA LNANESS SNVDHANSNEQHNAVMDWYLRQTAHNQQDO E 119 

Query: 200 SNQNAKRNQNQKGNAKGTQFTKQYLI DN I DKAYDLRKK 237 

++N N GN F+* • ++ +D D KK 
Sbjct: 120 DDEN- -NNNTDNGNDSNNKFSQSDIV- -VDDDDDKNKK 153 

>gi|2952545 (AF051898) coronin binding protein [Dictyostelium
discoideum]
Length = 560

Score = 44.9 bits (104), Expect = 6e-04

Identities = 26/83 (31%), Positives = 39/83 (46%), Gaps = 5/83 (6%)

Query: 131 NEDTTSNTDETSNQMATSLDNSTGOTMJRNAYVSLPQSEVNIDTO 190 

N + +N fNNtS +NS +N N+ + P N D DN T +NNT +N 
Sbjct: 404 NNNNNNNI INNNNSNSNSNNNSNN-NSNNNSNRNSPNHNNNGDNDNNT NNNTNNNN 4 58 

Query: 191 KTVNXSSNESNQNAKRNQNQKGN 213 

N ++N +N N (IN N 
SbjCt: 459 NNNNNNNNNNNNNNNNNNNNNNN 481 

Score = 41.4 bits (95), Expect = 0.006

Identities = 22/88 (25%), Positives = 43/88 (48%), Gaps = 6/88 (6%)

Query: 130 HNEDTTSNTDETSNQNATSLDN - - - STGMTANRNA YVS LPQS EVNIDVDNTTLRFADNNT 186 

+ +N++ SN N+ + +N + G AN++ + P + +N + DN +NN 

Sbjct: 337 NRNNSNNNSNNNSNNNSNNSNNRNITNGSNANKS - - -NSPNNNUfTNNDNKNNNSNNNNN 393 

Query: 187 IDNGKTVNKSSNESNQNAKRNQNQKGNA 214 

+N S+N +N N N N N* 

Sbjct: 394 SNNMSNNGNSNNNNNNNI INNNNSNSNS 421 

Score = 40.6 bits (93), Expect = 0.011

Identities = 24/101 (23%), Positives = 41/101 (39%), Gaps = 2/101 (1%)

Query: 115 SSEVBKYMSQGFTEHNEDTTSNTOETSNQNATSLDNSTGMTANRNAYVSLPQSBVNIDV 174 
S+ L + ++N +N ++ N S +N + N N S + N ♦ 
Sbjct: 370 SNS PNNNLhTTNNDNKNNNSNNNNNSNNNSNNGNSNNNNNNNI I NNNNSNSNSNNNSNNNS 429

Query: 175 DNTTLRFADN- - NTIDNGKTVNKSSNESNQNAXRNQNQKGN 213 

♦N + R + N K DN N ++N +H N N N N 
Sbjct: 430 NNNSNRNS P^n^NNNGDNDNNT^IN^^^KNNNNNNN^^NNNNNNN 470 

Score = 40.2 bits (92), Expect = 0.014

Identities = 21/80 (26%), Positives = 39/80 (48%), Gaps = 9/80 (11%)

Query: 130 HNEDTTSOTDETSNQNATSI^NSTGmANRNAYVSLPQSEVNIDVDNTTLRFADNNTIDN 189 

+N D +NT* +N N + +N+ N N N + +N +ADN+ ++ 

Sbjct: 442 NNGDNDNNTNNNTNNNNNNNNNNNNNNNNNNN NNNNNNNNNNYADNSKNNS 492

Query: 190 GKTVNKSSNESNQNAKRNQN 209 

+ N +SN +N N +N+N 
Sbjct: 493 SNSNNNNSNSNNNNDNXNEN 512

Score = 39.5 bits (90), Expect = 0.024

Identities = 26/111 (23%), Positives = 44/111 (39%), Gaps = 20/111 (18%)

Query: 112 VYSSSEVEKYLQSQ--GFTEHNEDTTSFn^CTSNQNATSLDNSTGMTANRNAYVSLPQSE 169 

VY + K+ ++ G +N ++ +N++ SN N ++N N N 
Sbjct: 296 VYCTHHHTKFYETHRNGLLNNNNNSNNNSNSNSNNNN^ 346 

Query: 170 VNIDVDNTTLRFADNNTIDNGKTVNKSS NESNQNAKRNQNQKGNA 214 

+• N ++N I NG NKS+ N +N N M N M+ 

Sbjct: 347 ---^SNNNSNNSNNRNITNGSNANKSNSPNNNL»miNDNra^ 394 

Score = 37.5 bits (85), Expect = 0.094

Identities = 24/96 (25%), Positives = 41/96 (42%), Gaps = 1/96 (1%)

Query: 124 SQGFTEHNEiyiTSNTDETSNQNATSU3NSTGM-TANRNAWSLPQSEVNIDVDtnTLRPA 182 

S + +N + SN + +♦ N DN+T T'N N + + N ♦ +N 
Sbjct: 421 SNNNSNNNSNNNSNRNSPNHNNNGDNDNNTNNNTNNNNNN^ 4 80 

Query: 183 DNNTIDNGKTVKKSSNESNQNAKRNQNQKGNAXGTQ 218 

+NN DN + + SN +N N+ N + K Q 
Sbjct: 481 NNNYADNSNNNS SNSNKHN SNSNNNNDNKNENSDNQ 516 

Score = 35.6 bits (80), Expect = 0.36

Identities = 25/99 (25%), Positives = 42/99 (42%), Gaps = 18/99 (18%)

Query: 130 HNEI)TTSNTDETSNQNATSUDNST-G^^'A^mNAYVSLPQSEVNIDVD^^^^lJ^FAENNTID 188 

+N + SN + +N N ++ NTS AN++ + P + +N + DN +NN + 

Sbjct: 339 NNSNNNSNNNSNNNSNNSNNRNITNGSNANKS NS PNNNIHTNNDNKNNNSNNNNNSN 395 

Query: 189 NGKTV NKSSNESNQNAKRNQNQKGN 213 

N N S++ SN K+ N N N 

Sbjct: 396 NNSNNGNSNNNNNNNIINNNNSNSNSNNNSNNNSNNNSN 434 

Score = 35.2 bits (79), Expect = 0.47

Identities = 21/94 (22%), Positives = 42/94 (44%), Gaps = 5/94 (5%)

Query: 124 SGjGETEHNEinTSNTDETSNQNATSIiDNSTGMTANRNAyVSLPQSBVNIDvBNTTLRFAD 183 

+ G + ++ +N T+N N + N+ N N+ + N ♦ +N + + 

Sbjct: 362 TNGSNANKSNS PNNNLNTNNDNKNNNSNN NNNSNNNSNNGNSNNNNNNNI INNNN 416 

Query: 184 NNTIDNGlcrVNKSSNESNQNAKRNQNQKGNAKGT 217 

+N+ N + N S+N SN+N+ + N N T 
Sbjct: 417 SNSNSNNNSNNNSNNNSNRNSPNHNNNGDNDNNT 4 50 

Score = 35.2 bits (79), Expect = 0.47

Identities = 29/118 (24%), Positives = 53/118 (44%), Gaps = 12/118 (10%)

Query: 115 SSEVEKYLQS-QGFTEHNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNID 173 

SS+ E ♦+ +GF + + T+N ++N D S+G + + + V+ P+S +N 

Sbjct: 114 SSDSBADIEDDKGFQD--KPITTNNSGSNNPLKNLKDYSSGSSGSSRSGVNQPRSNINNS 171 

Query: 174 VDNTTLRFADNNT - - - IDNGKTVNKSSNESNQNAKRNQNQKGNAKGTQPTKQ 222 

D + + *N+ I ♦ T + NQN +NQNQ N Q +Q 
Sbjct: 172 NDKYKSKSSSSNSNSSSSGGSLISSLLTGGNTYQNQNQMQNQNQNQNNNQSQLQQQQQ 229
Score = 34.4 bits (77), Expect = 0.81

Identities = 24/94 (25%), Positives = 38/94 (39%), Gaps = 12/94 (12%)

Query: 131 NEDTTSNTDETSNQNATSLDNSTGOT"ANRNAYVSLPQSEVNIDvT1NTT^ 190 

N +T +N + +N N ♦ +N+ N N S N N +NN+ N 

SbjCt: 451 NNtnTJNlWNNNNNNNNNNNNNNNKNNN^ NNNSNSNN 504 

Query: 191 KTVNXSSNESNQNAKR NQNQKGNAKGTQ 218 

NK+ M NQ+ R ++NQK ♦ Q 

Sbjct: 505 NNDNKNENSDNQSVLRSNEKFTDENQKNGSDDQQ 538 

Score = 33.6 bits (75), Expect = 1.4

Identities = 22/90 (24%), Positives = 35/90 (38%)

Query: 124 SQGFTEHNEDTTSNTDETSNQNATS LDNSTGMTANRNAYVSLPQS BVNIDVDNTTLRFAD 183 

S N SN N N+ N N ♦ + N + +N 

SbjCt: 353 SNNSNNRNI TNGSNANKSNS PNNNI^nTTODNKlJNNSNNNNNSNNNSNNGNSlJNroiNNNI I 412 

Query: 184 NNTIDNGKTVNKSSNESNQNAKBUQNQKGN 213 

NH N ♦ N S+N SN N+ RN N 
Sbjct: 413 NNNNSNSNSNNNSNNNSNNNSNRNS FNHNN 442 

>gi|535260|emb|CAA82996| (Z30339) STARP antigen [Plasmodium
reichenowi]
Length = 655

Score = 44.5 bits (103), Expect = 7e-04

Identities = 31/114 (27%), Positives = 47/114 (41%), Gaps = 14/114 (12%)

Query: 128 TEHNEDTTSNTD8TSNQNATSLDNSTGMTANRNAYVSLPQSEVN IDVDNTTLRF 181 

T++M T TD + ♦ +N+T AN + ♦+ N D +NT ♦ 

SbjCt: 433 TDNNNTNTKATDSNNTNTKATDNNNTTn'KATO 492 

Query: 182 ADNNTI DNGKTVNKS SNESNQNAXRNQNQ KGNAKGT QFTKQYLIDN 227 

DNN DN T K+++ +NNK N NKT T QY+ N 

SbjCt: 493 TDNNWTNTKATDNNNTNTXATDNNN^ 546 

Score = 44.5 bits (103), Expect = 7e-04

Identities = 30/103 (29%), Positives = 44/103 (42%), Gaps = 13/103 (12%)

Query: 128 TEHNEDTTSNTDETSNQNATSLDNS TGMTANRNAYVS LPQS EVN IDVDNTTL 179 

T++N T TD+++N + ♦ DN+ T T N N S D +NT 

Sbjct j 401 TDNNNTirrKATDKSNNTDTKATDNNNNTDTlCATO 460 

Query: 180 RF ADNNTI DNGKTVNKSSNESNQNAKRNQNQKGNAKGT 217 

+ DNN DN T K+++ +N N K N NKT 

SbjCt: 461 KATDNNOTNTKATDNNNTNTKATDNNNTNTKAT 503 

Score = 42.6 bits (96), Expect = 0.003

Identities = 27/96 (28%), Positives = 43/96 (44%), Gaps = 10/96 (10%)

Query: 128 TEHNBDTTSNTDETSNQNATSLD - NSTGMTANRNAYVSL PQS EVN IDVDNTTLRF ADNNT 186 

T++N +T + + +N N ♦ D N+T AN + ++ N NT + DNN 
SbjCt: 422 TDNNNNTDTKATDNNNTNTKATDSNNTNTKATDNNNTNTXATDNN NTNTKATDNNN 477 

Query: 187 I DNGKTVNKSSNESNQNAKRNQNQKGNAKGT 217 

ON T K+++ +N N K N NKT 
Sbjct: 478 TNTXATDNNNTNTKATDNNNTNTKATDNNNTNTKAT 513 

Score = 41.8 bits (96), Expect = 0.005

Identities = 35/150 (23%), Positives = 59/150 (39%), Gaps = 9/150 (6%)

Query: 85 EINRQTVEAFGMQVITVCITOEDYLNVVYSSSEVEKYI^ 144 

E N+ +♦ G T+ + N + E + +Q T +N TT+ + N 

Sbjct: 118 ETNKTNIKLTGNNSTTINTNLTENTNA- -TKKLTENVITNQILTGNNNTTTNTSSTEHNN 175 



Query: 145 NATSLDNSTGKTANRNAYVSLPQSEWIDVDNTTIjRFADNNTIDNGKTVNKSSNESNQNA 204 
N + NSTC T+ NI + N L *N T + T + +♦ +N N+ 
Sbjct: 176 NINTNTWSTGNTSTTKKLTB NI - ITNQI LTGNN>TITTNTSSTEHNNNINTNTNS 228

Query: 205 KRNQNQKGNAKGTQFTKQYLIDNIDKAYDL 234 

N N N T + DNI+ +L 

Sbjct: 229 TDNSfmn^LTDITTTTKKWTDNINTTQNL 258 

Score = 41.8 bits (96), Expect = 0.005

Identities = 30/101 (29%), Positives = 43/101 (41%), Gaps = 13/101 (12%)

Query: 130 HHEDTTSKTOBTSNQNATSIiDMS-TGMTANRMAYVSLPQSEVMIDV DNTTLRPA 182 

+N DT S ++ ++ AT DN+ T T N N * N D +NT ♦ 

Sbjct: 3 S3 NNTDT I STD^roNTI}TKATD^ro^^'DTKATD^n^N^m3TKATDNN^^T^TKATD KSNNTDTKAT 422 

Query: 183 DNN TIDNGCTVNKSSNESNQNAKRNQNQKGNAKGT 217 

DNN DN T K+++ +N N K N N K T 

sbjct-. 423 DmmimjTKXTDNmmrriavTDSNimrrKATDKimirntAT 463 
Score = 40.6 bits (93), Expect = 0.011

Identities = 31/121 (25%), Positives = 47/121 (38%), Gaps = 31/121 (25%)

Query: 128 TEHNEDTTSNTDBTSNQNAT SLDNSTGMTANRNAYVSLPQSEVN 171 

TJJHN + +NT+ T N + T ♦ +T N N ♦ +EN 

Sbjct: 171 TEHNNNINTNTNSTGNTSTTKKLTENI ITNQILTGNNNTTTNT S STEHNNN INTNTNSTD 230 

Query: 172 1 D VDNTT LRPADN NTIDNGKTVNKSSNESNQNAKRNQNQKGNAKG 216 

D+ TT +■+ ON T N TV* +S *N N K N N K 

Sbjct: 231 NSNTNTKLTDITTTTKXWTDNINTTQNLTC^ 290 

Query: 217 T 217 
T 

Sbjct: 291 T 291 
Score = 38.3 bits (87), Expect = 0.055

Identities = 28/98 (28%), Positives = 41/98 (41%), Gaps = 10/98 (10%)

Query: 128 TEHHEDTTSNTDETSMQlATSLDNSTGMTAKIi^YVSItPQSEVMIDVD -NTTLRPADNNT 186 

TBHN ♦ +NT+ S N+ + N T +T + + U* NTT DNN 

Sbjct: 216 TEKNNNINTNTN- - STDNSNTNTNLTD 1 TTTTKKWTDN INTTQNLTTSTNTTTVSTDNNN 273 



Query: 187 IDNGKTVNKSSNESNQNAKRNQNQKGNAKGT 217 

PN T KS++ N K N+ + K T 
Sbjct: 274 NN INTKPTONNNTN I KSTDNYHTGTKETDNKUTD I KAT 311 

Score = 37.5 bits (85), Expect = 0.094

Identities = 31/106 (29%), Positives = 45/106 (42%), Gaps = 18/106 (16%)

Query: 128 TEHNEDTTSNTDETSNQN ATSLDNSTGMTANRNAYVSLPQSEVN IDVDN 176 

T++N +T +T T N N AT N+T AN ♦ ' N D +N 

Sbjct: 390 TDNNNNT--DTKATDNNNTDTKATDKSNNTOT 447 

Query: 177 TTLRPADNN TIDNGKTVNKSSNESNQNAKRNQNQKGNAKGT 217 

T + DNN DN T K+++ +N N K N N K T 

Sbjct: 448 TNTXATDNNNTNTKATDNNNTNTKATDNNNTOT 493 

Score = 35.2 bits (79), Expect = 0.47

Identities = 24/109 (22%), Positives = 46/109 (42%), Gaps = 6/109 (5%)

Query: 128 TEHNSQTTSNTDETSNQNATSLDNSTGMTANRNA YVS LPQSEVN IDVDNTTLRF 181 

T++N T TO + + +N+T A N + ++ N D +NT + 

Sbjct: 473 TDNNNThH'KATDNNNTNTKATDNNNTjn'KATDNNN^ S3 2 

Query: 182 ADNNT1DNGKTVNKSSNESNQNAKRNQNQKGNAKGTQFTKQYLIDNIDK 230 

DNN N + +B+ + K N++ M++ ♦ K +• +DK 

Sbjct: 533 TDNNNNTNQYVFANNYDETTSDDKIiNKDSCONSEEKENI KSMINAYLDK 581 



Score = 34.4 bits (77), Expect = 0.81

Identities = 26/126 (20%), Positives = 46/126 (35%), Gaps = 7/126 (5%)
Query: 99 ITVCITHEDYLNVVYSSSEVEKYLQSQGFTEHNEDTTSNTDETSNQNATSLDNSTGMTAN 158

IT T+ + S + V S T +++ +N T N N ++ T 

Sbjct: 318 I TTDNTNTNV I STDNS KTWI S KDN S NTHT I STDNSKTNVI STDNNNTDT I STDNDNTDT 377 

Query: 159 RNAYVSLPQSEVNIDVDNTTLRFADNNTID NGKTVNKSSNESNQNAKRNQNQK 211 

* ♦+ + +NT ♦ DNN D N +N+N + KN 

Sbjct: 378 KATONDNTDTKATDNNNNTDTKATDNNNTDTKATOKSNNTOT 437 

Query: 212 GNAKGT 217 
N K T 

Sbjct: 438 TNTKAT 443 
Score = 34.4 bits (77), Expect = 0.81

Identities = 30/100 (30%), Positives = 44/100 (44%), Gaps = 14/100 (14%)

Query: 131 NEDTTSNTDETSMQKATSLDNS-TGMTANRNAy- - -V5LPQSEVNI - - -DVDNTTLRFAD 183 

N + T TO T N N S DNS T ♦ + N+ +S S+ N+ D +NT D 
Sbjct: 313 NNNITITTDNT - NTllVlSTDMSKTirVI SKDNSinW"! STDNSKTNVI STONNNTDTISTD 371 

Query: 184 NNTIDNGKTVNKSS NESNQNAXRNQNQKGNAKGT 217 

N+D T N ♦+ N+N+K N +KT 

Sbjct: 372 NDNTDTKATONDNTOTKATONNNNTDTKATONNNTOTKAT 411 

Score = 34.4 bits (77), Expect = 0.81

Identities = 28/101 (27%), Positives = 41/101 (39%), Gaps = 15/101 (14%)

Query: 131 NEDTTSNTDETSNQNATSLDNSTGMTA--NRNAYVSLPQSEVNIDV DNTTLRFA 182 

N DT + ++ +♦ AT +N+T ANN N D +NT + 

Sbjct: 374 NTDTKATONDNTDTKATONNNNTOTKATONNNTDTKATOKSNNTDTK^ 433 

Query: 183 DNNTIDNGK TVNKSSNESNQNAKRNQNQKGNAKGT 217 

DNN N K T K+++ +N N K N N K T 

Sbjct: 434 DNNN-TNTKATOSNNTNTKATONNNTNTKATDNNNTNTKAT 473 

Score = 32.5 bits (72), Expect = 3.1

Identities = 30/110 (27%), Positives = 40/110 (36%), Gaps = 23/110 (20%)

Query: 131 NEDTTSNTDETSNQNATSLDNS TGMTANRNAYVS LPQS EVNIDVDNTTLRF 181 

N +TT N ++N S DN+ TTNN + + D NT +♦ 

Sbjct: 251 NINTTQNLTTSTNTTTVSTONNNNNINTKPTONNNTNIXSTON^ 310 

Query: 182 ADNNTI DNGKTVNKSSNESNQNAKRNQNQKGNAKGT 217 

DNN I DNKTS+SN+ NKNT 

Sbjct: 311 TONNNITITTONTNTNVISTONSKTNVISXDNSNTHTISTONSKTNVIST 360 

>gi|1429240|emb|CAA67659| (X99260) lower collar protein 
[Bacteriophage B103] 
Length = 293

Score = 43.8 bits (101), Expect = 0.001

Identities = 53/204 (25%), Positives = 79/204 (37%), Gaps = 42/204 (20%)

Query: 56 BKVFKG FSLKDELSDLLPKKSFTIHFLD REINRQTVEAFGMQVITVCITHED 107 

BK+ KG F + + D <-+K F HP* RE I +T P ♦ T I + 

Sbjct: 26 EKIEKGRPKLFDPQYP IFDESYRKVPETHF X RNFYMREIG FETEGLFKFNLETWLI INMP 85 

Query: 108 YLNWYSSSBVBKY LQSQGFTEH NEDTT SNTDBTSNQNA 146 

Y N ++ S E+ KY h * G ++ N DTT SNT * NA 

Sbjct: 86 YFNKLFES - ELI KTOPLENTRLNTTGNICKNDTERNDNRDTTGSMKADGKSNTKTSDKT^ 144 

Query: 147 TSLDNSTGMTA NRNAYVSLPQSEVNIDVDN- - TTLRFADNNTIDNGKTVNKS 196 

T G T NR P S +N+ +♦ TL +A ♦ 1+ T NK 

Sbjct: 145 TGSS1G3DGKTTGSVTODNFNRKIDSDQPDSRLNLTTNDGQGTLBYA~ -SAIEENNTNNKR 202 

Query: 197 SNESNQNAKRNQNQKGNAKGTQPT 220 

+ N + + GT T 

Sbjct: 203 NTTGTNNVTSSAESESTGSGTSDT 226 



Query= pt|110879 44AHJDORF009 Phage 44AHJD ORF|5744-6496|2 1
(250 letters) 
>gi|2764981|emb|CAA69021.1| (Y07739) N-acetylmuramoyl-L-alanine
amidase [Staphylococcus phage Twort]
Length = 467

Score = 180 bits (452), Expect = 1e-44

Identities = 89/157 (56%), Positives = 109/157 (68%), Gaps = 8/157 (5%)

Query: 1 MXSQTOAKEWIYXHEGAGVDFDGAYGFQC^^ 60 

MK+ +QA+ +1 G DFDG YG+QCMDL+V Y+Y++TDGK+RMWGNAKDAINN F 

Sbjct: 1 MXTLKQAESYIKSKVimmSFDGLYGYQCMDLAV^ 60 

Query: 61 GLATVYKNTPSFKPQLGDVAVYTNGQ- - - YGHIQCVLS GNLDYYTCLEQNWLGGGF 113 

G ATVYKN P+F+P+ GDV V+T G YGHI V ♦ G+L Y T LEQNW G G 
Sbjct: 61 GTATVYKNYPAFRPKYGDVVWITGNFATYGHIAIVTNPDPYGDIiQYVTVLEQN^ 120 

Query: 114 DGWEKATIRTHYYDGVTHFIRPKFSGSNS - KALETSK 149 

E ATIRTH Y G+TKFIRP F* +S K +T K 
Sbjct: 121 YKTELATIRTHDYTGITHFIRPKFATESSVKKXDTKK 157 

Score = 61.7 bits (147), Expect = 6e-09

Identities = 41/125 (32%), Positives = 57/125 (44%), Gaps = 8/125 (6%)

Query: 125 YYMVTHFIRPKFSGSNSKALETSiCVNTFGPCWKRNQYGTYYRNEKGTFTC-GFLPIFARV 183 

YY+G T P +K ♦ +T G W N YGTYY++B+ TP C I R 

Sbjct: 346 YYEGKTPV- - PTWNQKAJCTKPVKQSSTSG - MKVNNYGTYYKSESATFKCTARQGI VTRY 402 

Query: 184 GSPKLSEPNGT^FQPNGYTPYKEVCl^DG Y'WI GYN^JQGTR - YYLPVRQWHGlCrGNSYSV 242 

P * P Y+ VC DGYVWI ♦ G + -f+PVR W+ N+ + 

Sbjct: 403 TGPFTTCPQAGVXYYGQSVTYDTVCXQDGYWIS 459 

Query: 243 GIPWG 247 

G WG 
Sbjct: 460 GQLWG 464 

>gi|113675|sp|P24556|ALYS_STAAU AUTOLYSIN
(N-ACETYLMURAMOYL-L-ALANINE AMIDASE)
>gi|79887|pir||JQ1147 N-acetylmuramoyl-L-alanine amidase
(EC 3.5.1.28) - Staphylococcus aureus >gi|153067
(M76714) peptidoglycan hydrolase [Staphylococcus aureus]
Length = 481

Score = 118 bits (292), Expect = 6e-26

Identities = 56/117 (47%), Positives = 68/117 (57%), Gaps = 1/117 (0%)

Query: 135 PKFSGSNSKALETSKVNTFGK-WtRHQYGTYYRKIMGTFTCGFLPI^ 193 

P ♦ SN ♦ V WXRN+YGTYY E+FTGPIR PLSPG 

Sbjct: 365 PVATVSNESSASSNTVKPVASAVnOWKYGTYYMEBSARFTNG^QPITVRKVGPFl^CPVG 424 

Query: 194 YlffFQPNGYTPYNEVCLSEGYVWIGYNWQOTRYYLPVRQWN^ 250 

Y FQP GY Y EV L DG+VW+GY W+G RYYLP+R WNG + +G WG S 
Sbjct: 425 YQ FQ PGGYCD YTEVMLQDGHVWVGYTWEGQRYYL PI RTWNGSAPPNQ I LGDLNGEI S 481 

Score = 78.0 bits (189), Expect = 7e-14

Identities = 48/109 (44%), Positives = 62/109 (56%), Gaps = 6/109 (5%)

Query: 15 EGAGVDFIX^YGFGCMDLSVAYVYYITDGiCVRMWaNAKDA- INNDFKGLATVYKNTPSFK 73 

EG + D YGFQC D + A ♦ + G ♦ AKD N+F GLATVY+NTP F 

Sbjct: 18 EGKQFNVDLWYG FQCFD YANAG - WKVL FGLLLKGLGAXD I PFANNFDGLATVYQNT PD FL 76 

Query: 74 PQLG0VAVYTNGQ YGHXQCVLSGNLDYYTCLEQNWLGGGP - DG WE JC 118 

Q GD+ V+ + YGH+ V+ LDY EQNWLGGG+ DG E+ 

Sbjct: 77 AQPGDMVVPGSMYGAGYGHVAWVIEATLDYI IVYEQNWLGGGWTDGIEQ 12S 

>gi|1763243 (U72397) amidase [bacteriophage 80 alpha]
Length = 481

Score = 118 bits (292), Expect = 6e-26

Identities = 56/117 (47%), Positives = 68/117 (57%), Gaps = 1/117 (0%)

Query: 135 PKFSGSNSKAL ET S JCVNTFGK- WKRNQYGTYYRNENGTFTCC FL P I FARVG S PKLS E PNG 193 

P ♦ SN + V WKRN+YGTYY E+FTGPIR PLSPG 

Sbjct: 365 WATVSNESSASSNTVKPVASAWKRNKYGTYYMEESARFTNGNQPITVRICV 424 
Query: 194 YWFQPNGYTPYNEVCLSDGYVWIGYNWQGTRYYLPVRQWNGKTGNSYSVGIPWGVFS 250

V FQP GY Y EV L DG+VW+GY W+G RYYLP+R WNG + +G WG S 
Sbjct: 425 YQFQPGGYCDYTBVMLQDGHVWVGYTWEGQRYYLPIRTWNGSAPPNQILGDLWGEIS 481 



Score = 83.5 bits (203), Expect = 2e-15

Identities = 50/115 (43%), Positives = 65/115 (56%), Gaps = 6/115 (5%)

Query: 9 EWIYKHEGAGVT)E^GAYGFQCMDLSVAYVYYITIX3KvTlMWGNAiOA-INNDFKGLATVYK 67 

EW+ EG + D YGFQC D + A + + G + AKD N+F GLATVY+ 

Sbjct: 12 EWLKTSEGKQFNVDLWYGFQCFDYANAG - WKVTiFGLLLKGLGAKDIPFANNFDGLATVYQ 70 

Query: 68 OTPSFKPQLGDVAVYTOGQ---YGHIQCVLSGNIJDYYTCLEQNWLGGGF-DGWEK 118 

NTP F Q GD+ V+ + YGH+ V+ LDY EQNWLGGG+ DO E+ 

Sbjct: 71 NTPDFLAQPGDMVVFGSNYGAGYGHVAWV1BATLDYIIVYEQNWLGGGWTDGIEQ 125 



>gi|4574237|gb|AAD23962.1|AF106851_1 (AF106851) LytN [Staphylococcus
aureus]
Length = 383

Score = 84.3 bits (205), Expect = 9e-16

Identities = 48/128 (37%), Positives = 68/128 (52%), Gaps = 7/128 (5%)

Query: IS EGAGTOET)GAYGFQCMDLSVAYvYYITDGKvTtMWGMAKLM 74 

E G DFDG+YG+QC DL Y ++ ++ +G N+F A +Y NTP+FK 

Sbjct: 2S2 ENRGWDFIXJSYGWQCFDLVNVYWNHLYGHGLKGYGAKD Z PYANNFNSEAKI YHNTPTFKA 311 

Query: 75 QLGDVAVYT NGQYGHIQCVLSGNLD YYTCUSQNWLGGGFIXWEKATIRTHYYD 127 

+ GD+ V++ G YGK VL+G+ D + L+QNW GG+ E A H Y+ 

Sbjct: 312 EPGDLVWSGRFGGGYGOTAIVT^GDYDGKI/IKFQSLDQNWNHGGWRK 371 

Query: 128 GVTHFIRP 135 
FIRP 

Sbjct: 372 NDMIFIRP 379 



>gi|3767593|dbj|BAA33856.1| (AB015195) LytN [Staphylococcus aureus]
Length = 383

Score = 84.3 bits (205), Expect = 9e-16

Identities = 48/128 (37%), Positives = 68/128 (52%), Gaps = 7/128 (5%)

Query: 15 EGAGVDFTK^YGFQCMDLSVAYVYYITDGKVRMWG^ 74 

B G DFDG+YG+QC DL Y ++ +♦ +G N+F A +Y NTP+FK 

Sbjct: 252 BNRGWDFDGS YGWQCFDLVNVYWNHLYGHGLKCYGAKDI PYANNFNS BAKI YHNTPTFKA 311 

Query: 75 QLGDVAVYT NGQYGHIQCVLSGNLD YYTCLEQNWLGGGFDGHEKAT1RTHYYD 127 

+ GD+ V++ G YGH VL+G+ D + L+QNW GG+ B A H Y+ 

Sbjct; 312 EPGDLVVFSGRFGGGYGHTAIVLNGDYDGJCLMKFQSLDQNWNNG^^ 371 

Query: 128 GVTHFIRP 135 
FIRP 

Sbjct: 372 NDMIFIRP 379 



>gi|2764983|emb|CAA69022.1| (Y07740) cell wall hydrolase Ply187
[Staphylococcus phage 187]
Length = 626

Score = 76.9 bits (186), Expect = 2e-13

Identities = 50/144 (34%), Positives = 68/144 (46%), Gaps = 18/144 (12%)

Query: 5 Q£AK£VJIYKH£GAGVDFIX3AYGFQCMDLSVAYVYYITDGKVRMW GNAKDAINNDF 5 9 

+Q +W G+GVD DG YG QC DL Y++ R W GNA+D + 

Sbjct r 12 KQWDWA I NL I G SGVDVDG YYGRQCWD LP - NY I FN RYWNFKTPGNARDMAWYRY 64 

Query: 60 KGLATVYKNTPSFKPQLGDVAVYTNGQY GHIQCVLS-GNLDYYTCLEQNWLGGGF 113 

V++NT F P+ GD+AV+T G Y GH V+ Y+ + +QNW 

Sbjct: 65 PEGFXVFRNTSDFVPKFGDIAVWIX3GNYNWNTWGHTGIW 124 

Query: 114 DGWEKATIRTHYYDGVTHFIRPKF 137 

A H Y GVTHF+RP + 
Sbjct: 125 YVGSPAAKIKHSYFGVTHFVRPAY 148 
>gi|3287732|sp|O05156|ALE1_STACP GLYCYL-GLYCINE ENDOPEPTIDASE ALE-1
PRECURSOR >gi|1890068|dbj|BAA13069| (D86328) ALE-1
[Staphylococcus capitis]
Length = 362

Score = 73.4 bits (177), Expect = 2e-12

Identities = 47/117 (40%), Positives = 61/117 (51%), Gaps = 10/117 (8%)

Query: 132 FIRPKrSGSNSKALETSICVNTFGKWKRNQYGTYYJWENGTFTCGFLPIFARVGSPKLSEP 191 

F++ GSNS TS N G +K N+YGT Y+fB+ +FT I R+ P S P 

Sbjct: 2S2 FLKSAGYGSNS TSSSNNNG-YKTNKYGTLYKSESASFTAN-TDI1TRLTGPFRSMP 305 

Query: 192 NG YWFQPNGYTPYNEVCLSDGYVWI G YNW - QGTRYYLP VRQWNGKTGNS YSVG I PWO 24 7 

+ Y+EV DG+VW+GYN G R YLPVR WN TG +G WG 
SbjCt: 306 QSGVIJUCGLTIKYDBVMKQDGHVWVGYNTNSGKRVYLPVRTWN8STG ELGPLWG 359 

>gi|79926|pir||A25881 lysostaphin precursor - Staphylococcus
simulans >gi|153047 (M15686) lysostaphin (ttg start
codon) [Staphylococcus simulans]
Length = 389

Score = 69.5 bits (167), Expect = 3e-11

Identities = 48/133 (36%), Positives = 62/133 (46%), Gaps = 20/133 (15%)

Query*. 131 HFIRPKFSGSNSKALETS KVNTFG K VnCRNQYGTYYRNENGTFTCG 175 

HF R S SNS A + K +GK WK N+YGT Y++E+ +FT 

SbjCt: 258 HPQRMVNSPSNSTAQDPMPFLKSAGYG1CAGGTVTPTPNTGWKTNKYGTLYKSESASPTPN 317 

Query: 176 FLPIFARVGSPKIiSEPMGYWFQPNGYTPYNEVCLSDGYVWiaYNW-QGTRYYLPVRQWNG 234 

I R P S P ♦ Y+EV DG+VW+GY G R YLPVR WN 

Sbjct: 318 - TD t ITRTTGPFRSMPQSGVLXAGQTIHYDEVMKQDGKVWVGYTGNSGQRI YLPVRTWNK 376 

Query: 235 KTGNSYSVGIPWG 247 

T ++G+ WG 
Sbjct: 377 STN TLGVLWG 386 

>gi|126496|sp|P10548|LSTP_STAST LYSOSTAPHIN PRECURSOR
(GLYCYL-GLYCINE ENDOPEPTIDASE) >gi|79927|pir||S01079
lysostaphin precursor - Staphylococcus simulans bv.
staphylolyticus >gi|581744|emb|CAA29494| (X06121)
lysostaphin (AA 1-480) [Staphylococcus simulans bv.
staphylolyticus]
Length = 480

Score = 69.5 bits (167), Expect = 3e-11

Identities = 48/133 (36%), Positives = 62/133 (46%), Gaps = 20/133 (15%)

Query: 131 HFIRPKFSGSNSXALETS KVNTFGK WKRNQYGTYYRNBNGTFTCG 175 

HP R S SNS A + K +GK WK N+YGT Y++B+ +FT 

SbjCt: 349 HFQRJIVMSFSNSTAQDPMPFLKSAGYGICAGGTVTPTPN 4 08 

Query: 176 FLPIFARVGSPiaSBPNGYWFQPNGYTPYNEVCLSDGYVWIGYNW-QGTRYYLPVRQWNG 234 

I R P S P + Y+EV DG+VW+GY G R YLPVR WN 

SbjCt: 409 -TDIITRTTGPraSMPQSGVLXAGQTIHYDEVHKQDGHVWVGYT 467 

Query: 235 KTGNSYSVGIPWG 247 

T ++Q+ WG 
SbjCt: 468 STN TLGVLWG 477 

>gi|3287967|sp|P10547|LSTP_STASI LYSOSTAPHIN PRECURSOR
(GLYCYL-GLYCINE ENDOPEPTIDASE) >gi|2072411 (U66883)
lysostaphin [Staphylococcus simulans]
Length = 493

Score = 69.5 bits (167), Expect = 3e-11

Identities = 48/133 (36%), Positives = 62/133 (46%), Gaps = 20/133 (15%)

Query: 131 HFIRPKFSGSNSKALETS KVNTFGK WKRNQYGTYYRNBNGTFTCG 175 

HP R S SNS A + K +GK WK N+YGT Y++B+ +FT 

SbjCt: 362 HFQRMVNSFSNSTAQDPMPFLKSAGYGKAGGTVTPTPNTGWKTNKYGTLYKSESASFTPN 4 21 

Query: 176 FLPIFARVGSPKLSEPNGYWFQPNGYTPYNEVCLSDGYVWIGYNW-QGTRYYLPVRQWNG 234 
I R P S P + Y+EV DG+VW+GY G R YLPVR WN

Sbjct: 422 -roilTRTTGPFRSMPQSGVLKAGQTIHYDEVMKQDGHVWVGYTGNSGQRIYLPVRTWNK 480 

Query: 235 KTGNSYSVGIPWG 247

T ++G+ WG
Sbjct: 481 STN---TLGVLWG 490



>gi|3341932|dbj|BAA31998.1| (AB009866) amidase (peptidoglycan
hydrolase) [bacteriophage phi PVL]
Length = 484

Score = 68.3 bits (164), Expect = 6e-11

Identities = 52/150 (34%), Positives = 71/150 (46%), Gaps = 17/150 (11%)

Query: 3 SQQQAKEWIYTOIEGAGVDPIX3AYGFQCMDLSVAYVYYIT 62 

++ QA++W G ♦ D YGFQC D+++IG+R+G IDK 

Sbjct: 4 TJCNQAEKWFDNSLGKQFNPDLFYGFQCYDYASMF- FMIATGE-RLQGLYAYNIPFDNKAR 61 

Query: 63 ATVY KNTPSPKPQLGDVAVYTN- - -GQYGHIQCVLSGNLDYYTCLEQNWLGGGF- - 113 

Y KN SP PQ D+ V+ + G GH++ V S NL+ +T QNW G G+ 
Sbjct: 62 IEiaGQIIKNYDSFLPQKlJ)IVVFPSKYGGGAGHVSIVESANLNTFTSFGQNWNGKGWTO 121 



Query: 114 DGVJ- -EKATIRTHYYDGVTHFIRPKF 137 

GW E T HYYD +FIR P 
Sbjct: 122 GVAQPGWGPETVTRHVHYYDDPMYFIRLNF 1S1 



Query= pt|110882 44AHJDORF012 Phage 44AHJD ORF|8391-8813|3 1
(140 letters)

>gi|140528|sp|P24811|YQXH_BACSU HYPOTHETICAL 15.7 KD PROTEIN IN
SPOIIIC-CWLA INTERGENIC REGION (ORF2)
>gi|322189|pir||B44816 orf2 5' of autolytic amidase -
Bacillus subtilis >gi|142801 (M59232) open reading frame
2 [Bacillus subtilis] >gi|1217874|dbj|BAA06959| (D32216)
ORF121 [Bacillus subtilis] >gi|1303767|dbj|BAA12423|
(D84432) YqdD [Bacillus subtilis]

>gi|2635036|emb|CAB14532| (Z99117) alternate gene name:
yqdD; similar to holin [Bacillus subtilis]
Length = 140



Score = 80.4 bits (195), Expect = 6e-15

Identities = 45/130 (34%), Positives = 67/130 (50%), Gaps = 3/130 (2%)



Query: 4 VKFRFTDSEAFHMFIYAGDIiKLLYFLFVLWFVDIITGISKAIKNNNLWSKKSKRGFSKKX 63

♦ P D ++P G +K L L VL +D+VTG+ KA K L S+ + G+ +K

Sbjct: 8 INFETLDLARVYLF GGVKYl^IXLVI^IIDVLTGVIKAWKFKKLRSRSAWFGYVRKL 64

Query: 64 XXXXXXXXXXXXXXXXXXKGGI^ITIFYYIANEGLSIVENCAEMDVLVPEQIKDKLRVI 123

G L T+ +YIANEGLSI EN A++ V +P I D+L+ I

Sbjct: 65 LNFPAVTLANVIiyiVIJniNGVLTFGTVLFYIANEGLS I TENLAQIGVKI PSS ITDRLQTI 124

Query: 124 KNDTBKSDNN 133

+N+ E+S NN

Sbjct: 125 ENEKEQSKNN 134





>gi|4126631|dbj|BAA36651.1| (AB016282) ORF45 [bacteriophage phi-105]
Length = 135

Score = 76.1 bits (184), Expect = 1e-13

Identities = 44/115 (38%), Positives = 61/115 (52%), Gaps = 4/115 (3%)

Query: 21 GDLKLLYFLFVLMF VD 1 1 TG I S KA I KNNNLW S KKSKRG FSKKyjOlXXXXXXXXXXXXXXX 80 

G++K L ♦ VL +DIITG+ KA K L S+- + G+ +K 
Sbjct: 17 GBVKYLDIJ1LVLNIIDIITOTIKAWKFI02LP^RSAWFG 76 

Query: 81 XKGGLLMITIFYYIANBGLSIVZNCAEMDVLVPEQIKEKLRVIKND TEKSD 131 

G L T+ +YIANEGLSI EN A++ V +P I D+L VI++D TEK D 
Sbjct: 77 LNGVLTFATVLFYIANEGLSITENLAQIGVKI PAVITDRLHVTESDNDQKTEKDD 131 



>gi|141088|sp|P26835|YNGD_CLOPE HYPOTHETICAL 14.9 KD PROTEIN IN NAGH
3'REGION (ORFD) >gi|1075967|pir||S43905 hypothetical
protein D - Clostridium perfringens >gi|455154 (M81878)
ORF D [Clostridium perfringens]
Length = 132



Score = 60.9 bits (145), Expect = 4e-09

Identities = 38/127 (29%), Positives = 63/127 (48%), Gaps = 3/127 (2%)

Query: 1 roEVKFRFTDSRAFHMFIY-AGDUCIXYFLFVLM^ 5 9 

+N +K+ +1+ A D+ L+ L V +F+D +TG+ K K* L S +RG 

Sbjct: 5 1HYIKWGIVSLGTLFTWIFGAWDIPLITLL-VFIFLDYLTGV1KGCKSKELCSM1GLRGI 63 

Query: 60 SKKXXXXXXXXXXXXXXXXXXXKGGLLMITI-FYYIANEGLSIVENCAEMDVLVPEQIKD 118 

+KK +1 ++YI NEG+SX+ENCA + V +PE++K 

Sbjct: 64 TKKGLlLVVXLVAVT<lJ)RIJiDMGTWMFRTLIAYFYIMNEGISILENCAAliGVPIPEKLKQ 123 

Query: 119 KLRVIKN 125 

L+ ♦ N 
Sbjct: 124 ALKQLNN 130 

>gi|2293160 (AF008220) YtkC [Bacillus subtilis]

>gi|2635548|emb|CAB15042| (Z99119) similar to autolytic
amidase [Bacillus subtilis]
Length = 134

Score = 36.4 bits (82), Expect = 0.099

Identities = 25/109 (22%), Positives = 41/109 (36%)

Query: 17 FIYAGDIJa.LYFLFVI>lFVDIITGISKArKNNNLWSKKSMRGFSKKXXXXXXXXXXXXXX 76 

F+G LLM++I+K +LKK KK 

Sbjct: 20 FFFGCFQYSFT<ILLSLMAlEFISTn4KETlIHKIiSFKKvTARLvTaa»vTLALISV 79 

Query: 77 XXXXXKGGU^ITIFYYIANMLSIVBNCAEMDVLVPEQIKDKLRVIKN 125 

4<3 ♦ +1 +YI E ♦ IV + + + VP+ + D L +KN 
Sbjct: 80 QLLNTQGSIRDIAIMFYILYESVQIVVrASSLGIPVPQMLVDLLETLKN 128 

>gi|1181973|emb|CAA87743.1| (Z47794) holin protein [Bacteriophage
CP-1]
Length = 134

Score = 31.3 bits (69), Expect = 3.3

Identities = 27/88 (30%), Positives = 36/88 (40%), Gaps = 5/88 (5%)

Query: 29 LFVU*FVDIITGISKAiraJNNLWSKKSMRG7S 86 

LF L+ D ITG KA K S K G +L 

Sbjct: 18 LFALIIiFDFITGFLKAWKWKVTOSWTGLKGVIKKTLTFIFYYPVAVFLTYIKAMAVGQIL 77 

Query: 87 MITIFYYIANEGLSIVENCAEMDVLVPE 114 

♦+ I Y A LSI+EN A M V +P+ 
Sbjct: 78 LVIINLYYA LSIMENLAVMGVFIPK 102 
Table 21

Phage 182 complete genome sequence. 17833 nucleotides. 

1 cagaacattg tcacaaaaca caaacataat aatgcatatt attgtttaca aatatgtaac ttcgcgatat 

71 aatatatttg taagttaaag gaggtgacaa aagaacaaac cacaaatgct ttagaaattg caaaaaccat 

141 cggaggaaaa ataatgaaat attcactaca acaaatagat gaaatcaaac caacaatttt cagaactaga 

211 tcaaaaaggc acgaactaga ggaattggtg gacgaagtaa acgacattgc caaagacccg gaggaaagat 

281 atcttttatc gttttattac acagaagaag aacgtetgcc tgaaattccc tctgcaagat caacagacta 

351 ttacaacgaa aagaccacaa atctgaaatc ggaaaecata tcactcgaaa aaagattaca aaaaccagta 

421 aaataattac acaaaaagct ttacaaatat aacacatcat gttatactaa aagagcagta agggaacgga 

491 aaacacccta cttcacacct caatcatcct catcaaaata caaaaggagg gaaaacaatg ggtcgaaaac 

561 taacgcaacg aaacgtaaca tcaaccaaag tagaattctc agaagttatc gtacaagatg gagcgccaac 

631 aattgtacca tgcgaaccag ttgccttaac aggaaaaccc tcagaagaaa aagctttatc agcgatcaaa 

701 cgtaaaaacc ctgacaaaaa cgtagttgta acaaatgttt cacatgaaac agcgctttac acaatgccag 

771 tcgacaaact tatcgagtta gcagacaaat caacacaagc ctaataaaaa caaaaccaaa acaaaacaga 

841 ggagattata atcatggaaa tcgtaaaaag cacatttgac acacaaacac cagaaggaat gttacaagta 

911 ctcaacgcca caaacggggc ttcaattccg ttacgtaacg caattggcga agtactagaa ttgaaagata 

981 ttctagttta cccagacgaa gtttctggtt ttggtggagc cgaaccatca caagcagaac tagtcgcctt

1051 cttcacagaa gacggtaaaa cttatgcggg tgtatcagca gtagcaacaa aaccagctaa aaacctaatt 

1121 gatacgatga ctgctaaccc tgacatcaaa ccaaaaattt cttttgtcga aggaaaaeca aacggtggac 

1191 aaaaatttgt aaacctacaa gcggtttcac tgtagcataa aaatacagga atctagcaag ecacttagcg 

1261 aatctcgcta ggtggttttt attatgtttc tacattgagg tgtgtagaat tgaccgtaag aacaccaaag 

1331 aatgatagag ccaagttaga gaaaacctac ggcaaatcca acaaagctcg taaaaaacac aatcgtttaa 

1401 gacaaaaagg agttgaggaa aggcaacttc caactgttcc aacatcaaag aaaagaccca ttgactacgt 

1471 aaaatcaaca aatatgagtc gtagtgattc taacaagacg tcagacgagt tggtagactt tgcacaaccc 

1541 tacaacgaga attacatttt tgagatcaac aagcgaaatg ttgcaatctc aagagcgcaa atcaaagaag 

1611 cgcaaattaa aacagagcaa gctcaaaaag cgaaagaaga acactacaaa gagcttaaca aagttgaagc 

1681 taagaagecc acagaaaaca caattgtcac accaactatt ttaacagagt eaggtgccga cttacctttt 

1751 caagcaatac cagattttaa tattgacgct ttcacttctc cagaaggagt ccagtcttat ttagaaaata 

1821 caggaaaaca agacgaacaa tattttgacg aaagagacca acttcattac gacaatttca gacaagcgat 

1891 gtttactatt tccaatccag acgctgacga tactgttcgt ttacttgact caatggggct tgatceattt 

1961 atgaaaacac atgttagtaa cttcttagac acgaaccttg accacattta tgacgaagca gaagtacaac 

2031 agaaaaaaga acaagtttac agtaagatcg caaaagtgac cgagtctgaa acaggtggag aagtcccctc 

2101 acacaacccc acgaagaaca tcacaattaa ttcagaaaca ggagaagaac taegattaag aaatatactg 

2171 gcgactttga aacaacaact gatctcaacg attgtcgtgt atggtcgtgg ggcgtacgcg atatagacaa 

2241 cgttgacaat atgacgttcg gcttagaaat cgattccttt cccgagtggt gtaaaatgca aggcagcaca 

2311 gacacttatt tccacaacga aaaatttgac ggagagttta tgctttcatg gttattcaaa aatggtctca 

2381 aatggtgtaa agaagcaaaa gaagatcgaa cattctccac actcatatca aatatgggtc aatggtatgc 

2451 tttggaaatt tgttgggaag ttaattacac aacaacaaaa ccaggcaaaa cgaaaaaaga gaaatctcga

2521 acaataattt acgatagcct taaaaaacac ccttttccag tgaaacaaat tgcagaagct tttaatcttc

2591 ctataaaaaa aggcgaaata gatcatacaa aagaaagacc taccggttac aaaccaacaa aagatgaacg 

2661 ggagcattta aagaacgaca cccagattac ggcgatggca ccaaaaattc aaetcgacca aggactaact 

2731 cgaatgacta gaggaagcga cgctttaggc gaccacaaag atcggctaaa agctacacac ggaaaaccaa 

2801 ctcccaaaca atggtttcct attttgtctt tagggtttga taaagactta cgtaaagcat acaaaggcgg 

2871 cttcactcgg gtaaaeaaag tttttcaagg gaaagaaata ggtgacggca ttgtcttcga tgtcaactet 

2941 ttgtatccct ctcaaatgca cgtaagaccc ttaccatatg gaacacctct attctacgaa ggagaataca 

3011 aaccgaacaa cgactatccg ctgtacattc aaaatatcaa agtaagattc cgtctaaagg agggetatat 

3081 cccaaccatt caagttaagc aaagttcatt acccacccaa aacgaatatc ttgaatcaag cgtaaacaag 

3151 ccaggagctg acgaattaat cgatcttacc ctcacaaacg ctgacctaga attattcccc gaacaccacg 

3221 ataccctaga gatacattac acttacggat atatgttcaa agcttctcgt gacatgtcca aaggctggat 

3291 cgacaaatgg accgaagtaa agaacaccac cgaaggggct agaaaagcta acgccaaagg eacgttaaac 

3361 agcttgtatg gaaagttcgg aacaaacccc gacattacag gaaaagtgcc ttacatgggc gaggacggca 

3431 ttgttcgatt gaeactagga gaagaagaat taagagatcc tgtttatgtt ccgcttgcca gttttgcgac 

3501 ggcccggggt agatatacta ccattacaac cgcccaaaaa tgttttgatc gcattactta tcgtgataca 

3571 gacagcattc atccagtagg aacagaagtt ccagaagcaa tcgatcacct ggctgatcct aaaaaacttg 

3641 gtcatcgggg gcatgaaagc acatttcaac gagcaaaatt cattcggcag aaaacacacg tagaagaaat 

3711 tgatggcgaa ttaaatgtaa agtgtgctgg tatgccagat cgaataaaag agactgcaac ttttgacaat 

3781 cctgaagtcg gtttttcaag ccacggaaag ttgctaccta aaagaacaca aggtggcgtg gtattagtag 

3851 acacaatgtt tacaatcaaa taaggaggac caacaatgga actacacaaa gcaatgttta tcgtacgtga 

3921 tgaaggtact attgacggtt acgatactga acactatgta gatatetctt cacatgactt tgaagaaata 

3991 cacggaaaag aaacacgcga aattgaagca gtaacattag taaaaacagg aaatttaaaa aaataaatca 

4061 tttacatcct ttgcaaagta tggtaaaata ctcttgtgat agccgacaag agtcaaaccc ggcgagattg 

4131 ggcgaatgta cacgtgaaac atcgtgcgct cccgttaagt tacggacaca taaacgttct gaccgtcaac 

4201 caatcgcaaa aaccttctag gagtagcccc caaacgtggc cactcttttt tgcgttccac agaattatgt 

4271 cccacgcgaa acagttttta tggtataata gaatcaaaag gaggcggaga ttatggaaat taaagaacar 

4341 gaaccaattt caaatggtat tcttgaaagt gtcacagacg gtgaagcaag atcaaagatt gtagaacatc 

4411 ttgaagcatt gcgagaagac cacggagcaa caaccgaagc cttgacatca geaaatagca cacccgaaaa 

4481 gttaaagaaa gataacgaag cgttggttat tccaaactca aaattgttcc gagaacgagc gaccgtagaa 

4551 ccagcagaaa ataacgaacc agaaacagac cagaatacca cactagacga tteaggaaet caaggaggaa 

4621 aaaacacggc tgacaaaacc acagaacaag atgtccttcg tgccacaaac gtagaaacac cagtacaatt 

4691 aacgactgct atttataata gttcatcatc tcttctccag gcgaacgtac ctatgccaaa cgcagacaac
4761 atcgaagcgg ctggtgcagg gatcacacgt

4831 accgtattgg taaagtagtt atccgataca 

4901 catgccttta ggtcgaacga ttgaagaaat

4971 gagtctgtta caggggtatt taaacaggaa 

5041 aaggctacca caaacaaacg atccaagaag 

5111 tagtcecgtt gctggtgtaa tgaacgcttt 

5181 ttattaatag caaaccacca agaaaaagag 

5251 atgcaaaaga atttacccgt aagatcaaac 

5321 cgctcaagga gttaaaacat ctacctcaaa 

5391 accattgacg ttgacgtttt agcagcggca 

5461 etattgatga gtttcctaaa aaagaaggcg 

5531 atggtttatg atctacgaca aattgtacaa 

5601 tattggttgc accaceacca actatattct 

5671 caacaaaacc tgtcacaaaa gttgcttttg 

5741 taccgcattg acatttacac cagtagaagc 

5811 ttggctaagg caaccgtaaa acaaacagca 

5881 gtcaatcatc agtaacattc acagctaccg 

5951 ctaaggagga caattatggc aagaaggtat 

6021 cctacacaca cacaagatgg cttaaaactc 

6091 taacgagaat agagattgtt cttatcaaag 

6161 aaagacgcct tatatgcccg caactatccc 

6231 atgcctttgt tactgatatt gaatataaga 

6301 acaaacttac cgtttcgaca ccggtacacg 

6371 ccgaatggaa tacctttcat taacacaact 

6441 atgtaacaac ttttcatcct aacgatggag 

6511 tggagataag gaagataaat caggaggatc 

6581 cccatcaatt caagtgggga ggtaeacaaa 

6651 cgtttcttac aacgaaagaa ecttttceaa 

6721 accattcatt gtggatcacg cgaacaaaac 

6791 ccaacctacg ctagtgatcc aacaggaaca 

6861 cattcgtacc taaaagaact gatcttgtag 

6931 tgttaaggaa tcaaaaccat ctatgtatcc 

7001 atgactttaa gacctgaata tcttacaggt 

7071 ctaataaagc gatgatcgag ccgattgatt 

7141 caagatgtta accgataatg atcctaacga 

7211 ggaaacaaaa actccttgat tgcccaagag 

7281 gtgcaatgag tacaggagga gcgatctttt 

7351 catcatggga gcaggacaac aagcaaacaa 

7421 ggtaaagtgg cagatatcga aaatattcca 

7491 caggaaactt tcaaaactat tatcaactgc 

7561 ccgttacttc ceaacgtatg gcacaaagag 

7631 tggaacttca ttaaattaaa agaaccaaat 

7701 aacaaatttt tagtgcaggc gttacgctte 

7771 agatgcatag gaaggaggaa taagatgagt 

7841 cagcaaaaag cagaccttat ccaaatgaac 

7911 ttaccgtaga caactcacgc tccetacgtt 

7981 cctcgttatt cagaaattgc ttcacacact 

8051 tcacggtttg cgcaggggca gaagatggtc 

8121 cgaagcaatg catcacaaga gatatcctgt 

8191 atgttgtata ataatgacct gaaagtcccc 

8261 acataaacca gatatcacga gtgaatcgaa 

8331 gaaatacttc ccattgctac aagcttataa 

8401 gatatggagt ctgacgaatc ttttaatgta 

8471 cagaattgaa cgaagtatgg aatgaagtgt 

8541 tgcacgtgta caaacaccag aagtcttacc 

8611 aaaccaagaa aagagtcttg cgatcgtgta 

8681 tgaagtttag aacagacgcc gcccgacaat 

8751 tggagggttg ccaagtgcta cetaaacgtt 

8821 aaaagaacgt attgaagttg gccgaaaaca 

8891 cgagcagaat ecgaaacaaa atttatcaat 

8961 catttaagtt taatctcgac gaatatttaa 

9031 tcttgaagag tttccgattt ttgatgacat 

9101 attgaeacaa acatcaaagc gaatcgtgat 

9171 acagaaacaa aaatacacgt gacacaggaa 

9241 tcaaaaagat ttgagaattg ccagcaatgg 

9311 gaagattrga gtaaagaaac aacaagctcc 

9381 cacgaagcaa cgcttctgaa aaagaaacaa 

9451 tacgattaca cgacataaag gtaaaaaggg 

9521 agtgttttga gaattgagaa aatgatcttt 

9591 gagggaggca gcaacaatgg tagattttaa 

9661 gaacgcttta gcaaatatcc tcatactgaa 

9731 taattgccta tctgaatgaa gccggtgctt 

9801 acattttgct gagaagttag aagagaccac 

9871 gaaaatttaa tcaatgatac tgtttttgca 

9941 ctgaaacacg tgccaacagt gcgaatattc 

10011 acttcggtac aagatccaac gcgacaatac 
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tcagacgcag taaaaaacga atttattcca acettagttg 
aaccctggcg taaccctttg aaaatgttta aaaaaggaaa 
tcttgctgac actgcacagg aacacaagtc caaccctgac 
gttcccgatg caaaaacatt gccccacgaa aetaatcgtg 
cacggtcaga aaaagcattt acttcacggg ataatttcaa 
atacacaggt gacgaagtaa gcgaatttga atacacgaaa 
ctattcaaag agatcgaaat tggcgaaatt accgaatcaa 
caacctetaa caaactagaa tttatgagct ccgcttacaa 
atctgatcaa tacgttatta ttgacgccga cacagacgca 
ctcaacatga gcaaaactga cttcgtagga cacaaaatcg 
aagaatcgtc aaatactgtg gcagctattg tagatagtga 
aacaacaagt ctataeaacc ctgaagggct atattggaat 
actcctcaat tcgggaacgc tgttgctttt gteaaatcag 
caagcgcaac aactagtgtt gccaaaggat cacctaaaga 
aacaaaccaa caaggagaag ttgtttcatc age ac cage a 
ggtaaagega ctgccgtaac egtagaagge ttagaagtcg 
gaggecaaca ageaaeggtt cttgttacgg ttacttctga 
acaaaegtaa aattgctggc taacgtgcct tttgataaca 
aacaggaaca ggaategtae ttcaattege ttcctgttct 
ggacacacaa ctegggggag tttttagagt agataaacac 
atctttaaaa acgaagaaac ttatcctagt aaatggcagt 
acgacaacac aagtttcgtt acctttgaaa ttgacgtttt 
agaaagtttc attgeaaaag aacaccctca actttattat 
gaagagtege ttgattaegg tagagaatac acaacaacaa 
tcaactttct tgttactcta acaagtgaag caatgecagt 
aatagtaggc ggcccatctc ctttttccta ttatttactt 
ccaaatgggg caggcaacgc taattttgga gagtacaegg 
ataagacagt egggaegtat gtaacgtcgt atacaggtat 
ggtaaggtat aatgeaggag gttcttataa gatcatgect 
acgaaaacac tegctttett ttgtgtaaaa gaagcaagaa 
ggaacgtgta taactacttt agagaagctt ttccgtttaa 
ctattgttta atagaaatta cagatacaaa aggacatgta 
ggcaaattga gtgtatatgt aaaaggttcg ttaggaattt 
atgatgtaag taactcaacc attattacca atttaagtga 
cgtaggagtt aaatctgact atgctcctgc attcatgeaa 
caaaacaccc gcaatacttt cagacaeggt atgggaaaca 
cagccttagc aagtaacaac ccxtttgttg gtttgactaa 
ccatgtttct gaaaaagaaa aeggtctgaa ccccttggca 
gataatgeaa cacagcttgg atcaaactca tctttcacaa 
gcttcaaaca aattaaatat gagtatgeaa caagacttga 
caatcgagta gcCacaccaa acttacaaac aagaaaagca 
attgtaggca caatgagtaa cgatgtatta acaegtgega 
ggcatacgaa tgatgttttg aattataacc aagacaaegg 
agacgaaaag gegcaggact tgctagaaat aacegttata 
cctacccaag tgatgtagaa gaaatcagct actatgaaca 
teagtcgett gaacgggaaa atttgecaaa atcaattgac 
aacggttatc ttggtttctt taaagaccct acacccgggt 
aaatcgatca ttatcacaac cctattttct ttacagcaaa 
tttaagatat gatgatgatg atgataaatc aaaatgtatc 
acgccaccaa gtttacatcg ttttgectta gatatggegg 
gagegcaaaa aacacctgta actattcaaa ctgatgaaaa 
ccaaattgac gaaaacaaCc aggctgtttt tgtggataaa 
tggcaaacaa atgctccaca tgtagtagat aaactacgat 
taacctttct aggtaecaac aacgccaacg tagacaagac 
taacaatgaa cagattgaaa gttcaggtaa catcttgtta 
aatcgtgtct ttggcgatga acttgaegga aagattgacg 
eacaaccggc ggcaggtcaa tcaaaaaaag accagatgag 
atattgaaag cttcactcat taccaacccg aatcacctcg 
attgtttgat tttgattatc cgttttatga cgaaacaaaa 
cacttttact tgagagagat aggctcagaa acgatgggat 
atccaaacac gecctattgg aataaaatgt tcctaccaaa 
ggactacacc attgatgaga aacagaaatt gttaaatgag 
gaatcgaaga accaaacgaa gcaagtagat caaacagaca 
caaccgattc tccctcaagg aacacttata cagacacccc 
agatggaaca ggtgtaatca attatgeaac aaatatcaca 
acaggcgttg aaacaaacaa cgacaaaaca aatcaaaata 
agaacacaga cattaataaa gatcaaaatc aaaccaaaga 
aaacactgat tatgecgact tacccgaaaa atategtaga 
agagaaatga acaaggaagg cttatttctc ettgettatg 
ccccgacaag cggtttgaeg gcttacccgc tgtattcaaa — 
tacagatatg aattactact agatgaagaa gtateggett 
tagecaaega tatgagtggt catccaaact acttcatcga 
aaatgacaca ctcaaaaaat ggtcgtctga tggtacgtta 
aatcacacca aagaaatcaa aagaccacaa atcttggttg 
ctttgacaaa aaataaaccg gatgetgecg atgatcgaac 
tgactatgga gccgacccta ttgacacgtt acgtattgtt 
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10081 gcaatcaata aagttagtgg ctggaacacc 

10151 gtgtataatg gcagacatta gaacacaact 

10221 aaagccgtta acactatgac taatagcggt 

10291 acgaaacaat gaatacctca gttcaaaacg 

103 62 attaaatgca aatgtcggea aaccaaceaa 

10431 ggtattcgtt atgtagaggt gtaatacggc 

10501 ttaatgcctg ttacaattgc taaaaatgtt 

10571 taagaggtaa cgctagtgaa gctaaaaeac 

10641 agaaactgac acagtaacat caaccgcaaa 

10711 gaacaagcga aaacaacagc aaaeagtatc 

10781 cacaaaaaag tgcaactgat ctagctgttc 

10851 attaccacag gaggaaaaat aacggcaaac 

10921 atccaagtgt tcgagcagaa aacttgttag 

10991 attatatgca gctggtgata aaacaaatgc 

11061 ataaagttta ccgaaagttt gacaaacccc 

11131 caaaacgtac cggttgtttc gcaaaacacc 

11201 atcaatcaca cctgatggca aagtaaccgt 

11271 aattgcgttt cccctctaaa ataaggaggt 

11341 aaagaagaaa atcaaaagaa ttacctattg 

11411 cattgcatgg aaatgaaaac caggagagct 

11481 tgatgtaagt gcttatgggg ttatcgctga 

11551 gaagaaaaaa gcgaaatggg caccactttt 

11621 ctaacaecat tgaattgaaa cgtgatgtac 

11691 gtttgaaaca atgacggcat ctaacgcaaa 

11761 ccacaaggcg cccaaagtgg taaaggaact 

11831 acttgtttgt tcgtaactgt actccaaacg 

11901 atttgaaaat tgcctattet ctaacacccc 

11971 acgtggcaag ggaacgatat caataccagg 

12041 ttcatttttg tacagcgacc attatcgaca 

12111 tcctggcaac acaatcgaag gcggcgcaag 

12181 aacaaccatt ttctagcaca cggaaacaga 

12251 ttgatgtaga tgtttattgt cgtaacccac 

12321 tgttgtttac ggacattacc gaaacttaaa 

12391 acgttgtatg gcggtggegt taatttctat 

12461 accggtttat tcaaacggct gacaatcgag 

12531 aacaaaagta aacacaccaa tgacctataa 

12601 gtgctaacag gtccaaatgc aagtaatgta 

12671 acaaatagct agaggacaaa caatcgctaa 

12741 ggagttgtcg ccaacctcca ttgggaatcg 

12811 gatatgggtc aggtcaatgg acgcccaaaa 

12881 tgctaaagct gaaacgttgg aaggtcaagc 

12951 gataatacac ctgtttcttc tgcaggttat 

13021 atattgatgt tgctacaatt aattttatgt 

13091 acttgatctt gcacaagctt atagtaagca 

13161 ggaaccccaa tcaagaatac aaatcttgat 

13231 caggaaacgg cagaccaaat aatttccatg 

13301 aatgattgca tgttgcgatg gaacagtaac 

13371 ataaatgatg gtacttacaa tatcgtttat 

13441 ttggcgacaa agccaagaac ggacaagttt 

13511 taaaaaagat tttatgactg egttaggacc 

135B1 tttttagggc aatgttttgg agacggagat 

13651 atcttattta tctattgcca tccgatgcct 

13721 agaatatacc acacaatggt tggcagatga 

13791 gcaatgatta tcgattttgt gttaggtttt 

13861 ttaaagctaa agcaggtatc attgttaagg 

13931 agtaaaattc ggtgcagtag gcattacaat 

14001 eatagtatac taggacatat ttcagatacc 

14 071 tagacggaac actcaacaga aaggacgata 

14141 ggaattgatc tttcaaaagc tccatgcgat 

14211 accctgattg tgaccgagca tttcaacaag 

14281 gcatgagagg ggtctagaag gtacacctca 

14351 attggtaaag ctgttcttat tcttgacttt 

14421 ttcttgatta cgtttataat aaaacaggcg 

14491 aactgacttt tctagtactg caaaaggcga 

14561 caaggctacc cccaaccagc gccacctaaa 

14631 gtaaaggacg ttcaccagga cacaacggca 

14 701 ggacctgtat gtaggtaaaa aacaggacca 

14 771 gatgagttta ttttcactct cacaacaggt 

14841 aattgtctga tccaacacaa ctcgatcaca 

14911 atcaatggtg tggacacccg aacaatttga 

14981 taggagtgta tagtatgaca aatagcctag 

150SI caatgcttta ggttttaatt gcctaatgtt 

15121 tataaaaaat ttgttgttaa tcgetttatt 

15191 cagaacttaa aaagattccc caacctctca 

IS 261 aaaaggaaaa gaattctatt gtgatgacaa 

15331 gaaaaatcta atgaatatcc cgaagttcgt 
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gccacaggag atatttatct taacattaaa ggaacggagg 
aacaagtgaa gatggatcag acaatttatt tccaatttca 
acgaatgtag aaggagaatt gggtacactc aaacaaaatg 
ctgtagttac cgccaatcaa gcaaaagatt ctgtagctga 
tcgaataaca acattagaga gcacagcggc caatcctgac 
agataaaaat attcaaatgc aggacaaaga tcataatcgt 
ccaacaggcg actctaatct tgaattagcc aatgctgaaa 
ttgcacaaca agctaaagaa actgctgccg gcttgtcaac 
tcaagcgttg acgaaggctg gtaeagcaca acaaaccgca 
agcgcagctg caacggcagc taaaaacaca gctgarccag 
gagcaagcag tttagaggac acagcaatac aatatactgc 
aaaaatattc aaatgaagga tagcaatgac aataatttat 
atttgaccag tcgcgctgaa ttaacaacga caaattgtca 
aatctcttat cccggtgcag taggtatgct cgaaggtatg 
gtgatcacaa cgctaccaga aggttttaga ccaataagaa 
acacaccaaa tccaacagat acaaaagaaa tggtttatgt 
aaatgacaat gtaggtaaaa tcgaatatct atccctagat 
tcatatggaa gaacgaatcg atattcaaat gaacaagatg 
caccctgaaa cgaacccgaa acaagttgtt tttgatgaaa 
tcaacaattt tgttgacaca agaaaaatga caactacaat 
cggtgtaaca gattgtacac caatattaaa taaattactt 
tattttcctc cttgtgaacg tgattcatat tatcgctttg 
ctgtagttac tttcttagga tcgggagaaa cgacattaaa 
catcgaaagt ttcaatattg atggttttgc attatggttg 
ttctttaatg atactcgcaa ttacaatcgt tttgactttg 
aaggaacgta tgttgttgtt gctagaggta gaggggttac 
tcaagcaatt aecaaaacag cttttcccga tgtaaatggt 
ggtacaggtt ttagaggttt ctttgtgaaa aacaaccgta 
atgacgatga ttatcagaat gtaattaatt tctgtgaaat 
ttattatcga ggatatgcgc ataacttgca tgtccaaaac 
aacgctttgt ttgagtttca agatgtggat caagcttata 
aagtcgaggg aatgaatagt acagctattt cacgtttaat 
gattacaggt aaattatatc gttgtcaagg acatgttatc 
tgtgacttga tggcacaaga agcacctttg acggacggtt 
ttaactacga tgggtttgtt gttcgtggtt tgtctaattc 
agcacctcag actgttttct ataatcgtag aatcgatcat 
tataactagg aggatatgag atggcaactc ttacaaatga 
aatactttca aaatatggce ataataaaaa ttcacaagta 
gctggtttga acccgaacag caatgaatat ggtggaggcg 
gcaatcttta tcgccaagca caaatttgtg ggttgtctaa 
agagatcatc gctcaagggg ataaaacagg tcaatggatg 
actaaccctc agaccctttc agcatttaaa caatctgcaa 
gtcactggga acgccctggt aaacttcata tcgaagaaag 
tattgacggt agcggtggcg gtggcgtaaa acgttgctat 
cctaaaagtt tcatgagtgg acaacttttt ggcacgcatg 
atggtttgga ctttggttca attgatcacc ctggcaatga 
acatgttgga acaatgggag cattaagagc gtattttgtg 
caagaattta gttataacca gtcaaatata aaggtaaaag 
gcgcaatacg tgacgcggat catttacatt taggtttcac 
ttctttcata gatgatggaa catgggaaga ccctttgaag 
actggcggag ataatgacga taacaataag gataaaaatg 
tgaatggttg gaaattttaa taaggagaaa aaggtatgat 
taatcatctt gtttatggtt tgattatatg gttaatggtt 
acaattgcca aatttaacaa ggaaatcgac tttagtagtt 
tggcagaaat ggttttagtg gtttacttta ttcctgtagc 
gtatataaca atgttggttg gtttgatttt atcagaaatt 
gatgatgata ataattggac tgattatgtt aagaagtttt 
ttaaatgatg aatggtattg atatctctag ttatcaaaca 
tttgtaaata ttaaagcaac aggcggaaca ggttatgtaa 
ctctgtcttt aggtaaaaag attggtgcgt accactttgc 
acaagaagcg caattctttt tagataatat taagggttac 
gaagggtcaa atcagaaaga tgtaaattgg gcgaaagcat 
ttaaagcatg gttttatacg tatacagcaa acctcaatac 
ttatggttta tgggttgctg aatatggatc aaatcaacca 
acaaataatt ttccaattgt tgcctgtttt cagtctacaa 
atcttgattt gaatgttttc tatggcgatg gtaatacatg 
aattgttcct cctgaaaata aaatatttga cgccacaagt 
agcacaagcg tgttttattt tgacggagaa acgatctttg 
ttagaggaac atacaatcat gttcatggaa aagaaatccc 
tatttactta aaaatgtatg aaaagaaacc agtatataaa 
gcgttaaact tgaagagaaa aacttatact ataaccctaa 
gtttgtaata ggcgcacgtg gtataggtaa aacttatggt 
aaacacggcg aacaatttat ttatttaaga agattcaaaa 
aaacaatggc gaaagaattt cctgatcata aacttgaagt 
attaatgggt tgggctgttc cacttagtac gtggggaatt 
acaattttgt ttgatgagtt tttaattgag aaatcaaaaa 
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15401 tcactCattC accaaacgaa gctgaagcct 

15471 tacaagacgc gteacgccga gtaatgcaac 

15541 ccagacttga ataagcgccc caatctatat 

15611 actccgcaga agtgaagaga gaaacacctt 

15681 tatcaacaat gagtrtgtca atgatagtga 

157S1 tgcgccactg ctttcgaagg gaaaaecttc 

15821 gccacgacta tcaaccaaac acaaaecatc 

15891 gccgacgaaa aactggcgaa ataattatta 

1S961 cggctcgata acaccgctat caagaactta 

16031 accctagcag agcraccacg atcagcccca 

16101 gcgatagttt tgttttggtt ctttggcgtc 

16171 ggcgtgttaa tgtagacgaa atcttttctc 

16241 aaatgcagcc ataggacgcc catttcctcc 

16311 cggctatatt ctaatgctct tgttaaggtg 

16381 caCaaaacac cgtgacaccg tatattggtt 

16451 cccctcggca tttgtaacgc taactgatag 

16521 cccgacaata ccccccaaga atgttaaatt 

16591 CCcggCgata cttattCccg gaacgtcgaa 

16661 cctgaaaggt tacgtttaca gtagaaacgt 

16731 caatcatttc aattcctcct atctgtccgt 

16801 tgttcaacgc ttttcattga tttcgttatc 

16B71 atttatcatg cgtcaacacg aactcttttg 

16941 tgt tact tec gactcgacag acgctaaacc 

17011 attaaCgaCa aattgctaat catgcaaaac 

17081 ataagattgg cagcactgca tcgaatcaac 

17151 atccatatct aatccctcea gttcctcaaa 

17221 ccaacaagat aatgtttatc gtccccggta 

17291 gaagtagaga tacctctcct ttttcagcta 

17361 aatttgatac tgacaccacc aaccaaatgc 

17431 actgagaaag cccagctatc accaaacgaa 

17501 caaattccaa acagaggaat ttaceaagtt 

17S71 cgaataaact tctgtgtata cgatcggttc 

17641 accatgtatc cacatatatg tcaatcattt 

17711 gatcccccct tcactacacc catactacac 

17781 tgtagtctgg ggtcagctac atctgCgtca 
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taccgaacac gacggaaacg gctctccgaa gacgcacaaa 
tagcgcagtg aacccccacc Ccttgeaecc caatccgcag 
caagaccgag gcacaccgac tgaatcgegc gactcaaaag 
ttggtagatt gacccgcgga acagaacacg aagaccccag 
tacgcctacc gaaaagagaa gcaaaaacag tagttcccta 
gggcaccgga tagacgccga aacaggccgt gcctacgCga 
tttacgcaat gacCacgaaa gaccacgaag aaaat agate 
CCCCtcaaca gtggcgaaag caCtcaagaa tagttatctg 
cattacgacc Cgcccaacaa gatgaaaatc tggcaaccct 
ccacaacgac gaacagcaga taacatagca accgtagtcc 
agcgacctcc gccaacgccc tttcgtccgc tcctggaccg 
acagttcttt ctccctacac agccctaaca accccctgta 
caccccaacg caacccacca taeccacccc taggtataca 
agaggctegg tcccgCgCac caaaaccccc caaccaccta 
ccccgcagaa tgcagccacc accccacccc ccccaaacag 
cgagaaccaa ctcccacgca CgaagccacC aaccccaccg 
gacccgattc gggcaacagc gecgaacgag tcaacaaaag 
accttgeaaa gcccccccta tgaCcCcCaC tccctcaccg 
aaccactcaa ccagcccgcg gcgctccctg aacgcccgtg 
aaCCCgCCCa cacccgccac gCCCcaaCCg tcccgcaCag 
gcgatattaa tgcaatggct aCcaagataa acaCagCcac 
caacgcaatc aacgeacaaa aCcaaCtgct CCcccccCCg 
atcgccgcca cccttagtca gecgacttaa accccccaaa 
acccccttta cacCaactcg acaccgatac caccaaccga 
acgttatccc egcagtttcc catgaacacc eggaaacaag 
agataacaaa caacaccccc caccgcccac cccaccaaca 
cccacgacac gaeaaeccac atcccaccca ccaaaggggc 
ccaacgaccc accgcccaca Cgaaacaccc ccctcatact 
gaccggcagc accgcaccaa accaacaccc tggataattt 
actgctccac ccccaagtaa ccctccagce ccacccaccc 
tacccccacc tccaaaaacc eecaeacaea ccacgccacc 
attcacgcct accaccccct ccctaccaca CatatagCat 
aactcactta ttttaacgat ecacctgacc gtccecttat 
catgtatgat tgtacctgcc aacaattaaa ttcacaCaaa 
tcaaaaaaag ataatactct att 
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Table 22 



Phage 182 ORFs list

nb   Name        Frame   Position        Size (a.a.)   Key words
1    182ORF001    2      5966..7780      604           Tail protein;
2    182ORF002    1      2152..3873      573           DNA polymerase;
3    182ORF003    1      11305..12639    444
4    182ORF004    3      4626..5954      442           Major head protein;
5    182ORF005    3      12651..13700    349           Glycyl-glycine endopeptidase; Lysostaphin precursor
6    182ORF006    1      14995..16026    343           Encapsidation protein; ATP/GTP-binding site motif A;
7    182ORF007    1      7795..8775      326           Upper collar protein;
8    182ORF008    2      14105..14983    292           Lysozyme; Muramidase;
9    182ORF010    2      1310..2155      281           Terminal protein;
10   182ORF009    2      8765..9601      278           Lower collar protein;
11   182ORF011    1      9607..10158     183           Pre-neck appendage protein;
12   182ORF012    3      10872..11294    140
13   182ORF013    1      10456..10860    134
14   182ORF014    3      13716..14108    130           Lysis protein;
15   182ORF015    2      854..1225       123           Early protein;
16   182ORF018   -2      16429..16737    102
17   182ORF020    3      10158..10454    96            Leucine-zipper motif;
18   182ORF019    3      4323..4613      96            Head protein;
19   182ORF016   -3      16749..17033    94
20   182ORF022    1      12868..13149    93
21   182ORF023   -2      11914..12189    91
22   182ORF017    1      154..426        90
23   182ORF024    3      6174..6446      90
24   182ORF025    2      548..814        88            Early protein;
25   182ORF026   -3      12999..13259    86
26   182ORF027   -1      14642..14896    84
27   182ORF028    3      14430..14672    80
28   182ORF021   -3      17106..17339    77
29   182ORF030   -1      16199..16429    76
30   182ORF031   -3      8379..8603      74
31   182ORF032   -1      11195..11413    72
32   182ORF033   -1      4727..4942      71
33   182ORF034   -1      5951..6160      69
34   182ORF029   -3      17412..17606    64
35   182ORF035   -3      15570..15758    62
36   182ORF036   -3      2127..2315      62
37   182ORF037   -1      12095..12280    61
38   182ORF038    3      14769..14951    60
39   182ORF039    2      9992..10171     59
40   182ORF040   -3      16029..16202    57
41   182ORF041    1      3886..4056      56            Early protein;
42   182ORF042   -3      10671..10832    53
43   182ORF043   -3      10491..10652    53
44   182ORF044   -1      6299..6457      52
45   182ORF045   -2      6571..6729      52
46   182ORF046    2      2372..2527      51
47   182ORF047   -2      13201..13353    50
48   182ORF048   -3      3243..3395      50
49   182ORF049    3      1578..1724      48
50   182ORF050    2      8012..8155      47
51   182ORF051    3      9390..9530      46
52   182ORF052    1      4096..4233      45
53   182ORF053    2      15656..15793    45
54   182ORF054   -2      8002..8136      44
55   182ORF055    2      8324..8455      43
56   182ORF056    3      6549..6680      43
57   182ORF057   -3      8133..8264      43
58   182ORF058   -1      5048..5176      42
59   182ORF059   -2      15748..15876    42
60   182ORF060   -3      15276..15404    42
61   182ORF061   -3      1974..2102      42
62   182ORF062   -2      1867..1992      41
63   182ORF063   -3      14181..14306    41
64   182ORF064   -2      7234..7356      40
65   182ORF065   -2      3460..3582      40
66   182ORF066    1      4234..4353      39
67   182ORF067   -1      13763..13882    39
68   182ORF068   -1      7148..7267      39
69   182ORF069   -3      4908..5027      39
70   182ORF070   -3      912..1031       39
71   182ORF071    2      11741..11857    38
72   182ORF072   -3      11610..11723    37
73   182ORF073   -3      2763..2876      37
74   182ORF074   -1      8813..8923      36
75   182ORF075   -3      7353..7463      36
76   182ORF076   -3      2316..2426      36
77   182ORF077    2      11858..11965    35
78   182ORF078   -2      7564..7671      35
79   182ORF079   -2      7381..7488      35
80   182ORF080   -2      4372..4473      33
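
The Position, Frame and Size columns of the ORF list appear to be related by simple arithmetic. The sketch below is illustrative only and is not part of the original disclosure. It assumes that coordinates are 1-based and inclusive, that the end coordinate includes the stop codon, and that forward-strand frames are numbered 1 to 3 from the first base of the genome. The function names orf_size_aa and forward_frame are hypothetical, and reverse-strand frames are not handled because their numbering convention is not stated.

```python
# Illustrative sketch only: relates the Position, Frame and Size columns.
# Assumptions (not stated in the source): 1-based inclusive coordinates,
# end coordinate includes the stop codon, forward frames numbered 1-3
# from the first base of the genome.


def orf_size_aa(start: int, end: int) -> int:
    """Number of amino acids encoded by start..end, excluding the stop codon."""
    length_nt = end - start + 1
    if length_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length_nt // 3 - 1


def forward_frame(start: int) -> int:
    """Reading frame (1, 2 or 3) of a forward-strand ORF starting at `start`."""
    return (start - 1) % 3 + 1


# Three forward-strand rows taken from the ORF list.
rows = [
    ("182ORF001", 5966, 7780),
    ("182ORF002", 2152, 3873),
    ("182ORF004", 4626, 5954),
]

for name, start, end in rows:
    print(name, forward_frame(start), orf_size_aa(start, end))
```

For the three rows shown, the computed frame and size agree with the listed values (frames 2, 1 and 3; sizes 604, 573 and 442).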
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Table 23 

Predicted amino acid sequences of ORFs from phage 182

182ORF001



5966 atggcaagaaggtatacaaatgtaaaattgctggctaacgtgccttttgataacacctatacacacacaagatggtttaaaact

1 MARRYTNVKLLANVPFDNTYTHTRWFKT 

6050 caacaggaacaggaatcgtactttaattcgtttcctgttcttaacgagaatagagattgttcttatcaaagggatacacaactc 

29 QQEQESYFNSFPVLNENRDCSYQRDTQL

6134 gggggagtttttagagtagataaacacaaagacgccttatatgcttgtaactatctcatctttaaaaacgaagaaacttatcct 

57 GGVFRVDKHKDALYACNYLIFKNEETYP

6218 agtaaatggcagtatgcctttgttactgatattgaatataagaatgacaacacaagtttcgttacctttgaaattgatgtttta 

85 SKWQYAFVTDIEYKNDNTSFVTFEIDVL

6302 caaacttatcgtttcgatattggtatacgagaaagtttcattgcaaaagaacaccctcaactttattattcgaatggaatacct 

113 QTYRFDIGIRESFIAKEHPQLYYSNGIP 

6386 ttcattaatacaattgaagagtcgcttgattacggtagagaatacacaacaacaaatgtaacaacttttcatcctaacgatgga 

141 FINTIEESLDYGREYTTTNVTTFHPNDG 

6470 gtcaattttcttgttattctaacaagtgaagcaatgccagttggagataaggaagataaatcaggaggatcaatagtaggtggc 

169 VNFLVILTSEAMPVGDKEDKSGGSIVGG

6554 ccatctcctttttcctattatttacttcctaccaattcaagtggggaggtatacaaaccaaatggggcaggcaatgctaatttt 

197 PSPFSYYLLPINSSGEVYKPNGAGNANF

6638 ggagagtacatggcgtttcttacaacgaaagaaccttttttaaataagatagtcgggatgtatgtaacgtcgtatacaggtata

225 GEYMAFLTTKEPFLNKIVGMYVTSYTGI 

6722 ccattcattgtggatcacgcgaacaaaacggtaaggtataatgcaggaggttcttataagatcatgcttccaacctacgctagt

253 PFIVDHANKTVRYNAGGSYKIMLPTYAS

6806 gatccaacaggaacaatgaaaacattcgctttcttttgtgtaaaagaagcaagaacattcgtacctaaaagaattgatcttgta 

281 DPTGTMKTFAFFCVKEARTFVPKRIDLV

6890 gggaacgtgtataactactttagagaagcttttccgtttaatgttaaggaatcaaaactatttatgtatccctattgtttaata 

309 GNVYNYFREAFPFNVKESKLFMYPYCLI

6974 gaaattacagatacaaaaggacatgtaatgactttaagacctgaatatcttacaggtggcaaattgagtgtatatgtaaaaggt 

337 EITDTKGHVMTLRPEYLTGGKLSVYVKG

7058 tcgttaggaatttctaataaagtgatgatcgagccgattgattatgatgtaagtaactcaaccattattaccaatttaagtgac 

365 SLGISNKVMIEPIDYDVSNSTIITNLSD 

7142 aagatgttaatcgataatgatcctaacgatgtaggagttaaatctgactatgcttctgcactcatgcaaggaaacaaaaactcc 

393 KMLIDNDPNDVGVKSDYASAFMQGNKNS 

7226 ttgattgctcaagagcaaaacatttcgcaatactttcagacatggtatgggaaacagtgcaatgagtacaggaggagcgatcttt 

421 LIAQEQNIRNTFRHGMGNSAMSTGGAIF

7310 tcagccttagcaagtaacaacccttttgttggtttgactaacatcatgggagcaggacaacaagtaaacaactacgtttctgaa

449 SALASNNPFVGLTNIMGAGQQVNNYVSE

7394 aaagaaaacggtttgaacctcttggcaggtaaagtggcagatatcgaaaatattccagataatgtaacacagcttggatcaaac 

477 KENGLNLLAGKVADIENIPDNVTQLGSN

7478 ttatctttcacaacaggaaactttcaaaactattatcaactgcgcttcaaacaaattaaatatgagtatgcaacaagacttgat

505 LSFTTGNFQNYYQLRFKQIKYEYATRLD

7562 cgttacttctcaatgtatggcacaaagagcaatcgagtagctacaccaaacttacaaacaagaaaagcatggaatttcattaaa 

533 RYFSMYGTKSNRVATPNLQTRKAWNFIK

7646 ttaaaagaaccaaatattgtaggcacaacgagtaacgatgtattaacacgcgtgaaacaaatttttagtgcaggcgttacgctt 

561 LKEPNIVGTMSNDVLTRVKQIFSAGVTL

7730 tggcatacgaatgatgttttgaattataaccaagacaacggagatgtatag 7780

589 WHTNDVLNYNQDNGDV*
182ORF002 

2152 atgattaagaaatatactggcgactttgaaacaacaactgatctcaacgattgtcgtgtatggtcgtggggcgtatgcgatata 

1 MIKKYTGDFETTTDLNDCRVWSWGVCDI

2236 gacaacgttgacaatatgacgctcggtttagaaatcgattctttttttgagtggtgtaaaatgcaaggcagcacagacatttat 

29 DNVDNMTFGLEIDSFFEWCKMQGSTDIY

2320 ctccacaacgaaaaatttgacggagagttcatgctttcacggttactcaaaaatggtttcaaatggtgtaaagaagcaaaagaa 

57 PHNEKFDGEFMLSWLFKNGFKWCKEAKE

2404 gatcgaacattctccacactcatatcaaatatgggtcaatggtatgctttggaaatttgttgggaagttaattacacaacaaca 

85 DRTFSTLISNMGQWYALEICWEVNYTTT

2488 aaatcaggtaaaacgaaaaaagagaaatctcgaacaataatttatgatagccttaaaaaacatccttttccagcgaaacaaatt 

113 KSGKTKKEKSRTIIYDSLKKYPFPVKQI

2572 gcagaagcttttaattttcctataaaaaaaggcgaaatagattatacaaaagaaagacctattggttacaaaccaacaaaagat

141 AEAFNFPIKKGEIDYTKERPIGYKPTKD

2656 gaatgggagtatttaaagaacgacattcagattatggcgatggcatcaaaaattcaattcgatcaaggactaactcgaatgact

169 EWEYLKNDIQIMAMALKIQFDQGLTRMT 

2740 agaggaagcgacgctttaggcgattacaaagattggctaaaagctacacatggaaaatcaactttcaaacaatggtttcctatt

197 RGSDALGDYKDWLKATHGKSTFKQWFPI

2824 ctgtctttagggtttgataaagacttacgtaaagcatacaaaggcggcttcacttgggtaaacaaagtttttcaagggaaagaa

225 LSLGFDKDLRKAYKGGFTWVNKVFQGKE

2908 ataggtgacggcattgtctttgatgtcaactctttgtatccctctcaaacgtacgcaagacctttaccatatggaacacctcta 

253 IGDGIVFDVNSLYPSQMYVRPLPYGTPL

2992 ttctacgaaggagaatacaaaccgaacaacgactatccgctgtacattcaaaatatcaaagtaagattccgtttaaaggagggt

281 FYEGEYKPNNDYPLYIQNIKVRFRLKEG

3076 tatactccaaccattcaagctaagcaaagttcattattcattcaaaacgaatatcctgaatcaagtgtaaacaagttaggagtt 
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309 YIPTIQVKQSSLFIQNEYLESSVKKLGV 

3160 gacgaattaatcgatcttactcttacaaatgttgacctagaattattttttgaacactacgatatcttagagatacattacact

337 DELIDLTLTNVDLELFFEHYDILEIHYT 

3244 tacggatatatgttcaaagcttcttgtgatatgttcaaaggctggatcgataaatggatcgaagcaaagaacaccaccgaaggg 

365 YGYMFKASCDMFKGWIDKWIEVKNTTEG

3328 gctagaaaagctaacgccaaaggtacgttaaatagcttgcatggaaagttcggaacaaaccctgacattacaggaaaagtgccc

393 ARKANAKGMLNSLYGKFGTNPDITGKVP

3412 tacatgggcgaggacggcattgttcgattgacactaggagaagaagaattaagagatcctgtttatgttccgctcgctagtttt

421 YMGEDGIVRLTLGEEELRDPVYVPLASF 

3496 gtgacggcttggggtagatatactaccattacaaccgctcaaaaatgttttgatcgcattatttattgtgacacagacagcatt 

449 VTAWGRYTTITTAQKCFDRIIYCDTDSI

3580 catctagtaggaacagaagttccagaagcaatcgatcacttggttgatcctaaaaaacttggctattgggggcatgaaagcaca

477 HLVGTEVPEAIDHLVDPKKLGYWGHEST 

3664 tttcaacgagcaaaattcattcggcagaaaacatacgtagaagaaattgatggcgaattaaatgtaaagtgtgctggtatgcca

505 FQRAKFIRQKTYVEEIDGELNVKCAGMP

3748 gatcgaataaaagagattgtaacttccgacaactccgaagctggttcttcaagccatggaaagctgctacctaaaagaacacaa

533 DRIKEIVTFDNFEVGPSSYGKLLPKRTQ 

3832 ggtggcgtggtattagtagacacaatgtttacaaccaaacaa 3873 

561 GGVVLVDTMFTIK*
182ORF003

11305 atggaagaacgaattgatattcaaatgaacaagatgaaagaagaaaatcaaaagaattacctattgcaccctgaaacgaacccg 

1 MEERIDIQMNKMKEENQKNYLLHPETNP

11389 aaacaagttgtttttgatgaaacattgcatggaaatgaaaatcaggagagtttcaacaattttgttgacacaagaaaaatgaca 

29 KQVVFDETLHGNENQESFNNFVDTRKMT

11473 accacaaccgacgtaagcgctcatggggctaccgctgacggtgtaacagattgtacaccaatattaaataaactacttgaagaa 

57 TTIDVSAYGVIADGVTDCTPILNKLLEE

11557 aaaagcgaaatgggcatcaccccctatccccctccctgcgaacgtgactcatattatcgcttcgctaacaccactgaatcgaaa 

85 KSEMGITFYFPPCERDSYYRFANTIELK

11641 cgtgatgtacctgtagttactttcttaggatcgggagaaacgacattaaagtttgaaacaatgacggcatttaatgcaaacatc 

113 RDVPVVTFLGSGETTLKFETMTAFNANI

11725 gaaagtttcaatattgatggtcttgcattatggttgccacaaggcgctcaaagtggtaaaggaattttctttaacgatactcgc 

141 ESFNIDGFALWLPQGAQSGKGIFFNDTR

11809 aattacaatcgttttgacttcgatctgttcgttcgcaaccgtacttcaaatgaaggaacgtatgttgttgctgctagaggtaga 

169 NYNRFDFDLFVRNCTLNEGTYVVVARGR

11893 ggggttacatttgaaaattgtctattccctaacacctctcaagcaattatcaaaacagcttctcccgatgtaaatggtatgtgg 

197 GVTFENCLFSNISQAIIKTAFPDVNGMW

11977 caagggaacgatatcaatactaggggtacaggttttagaggtttctttgtgaaaaacaaccgcattcactcttgtacagcgatc

225 QGNDINTRGTGFRGFFVKNNRIHPCTAI

12061 attatcgacaatgacgatgattatcagaatgtaattaatttctgtgaaatttctggtaacacaatcgaaggtggcgtaagttat 

253 IIDNDDDYQNVINFCEISGNTIEGGVSY

12145 catcgaggatacgcgcataacttgcacgtccaaaacaacaaccattttctagcatacggaaatagaaacgccttgcttgagttt

281 YRGYAHNLHVQNNNHFLAYGNRNALFEF

12229 caagatgtggatcaagctcacactgatgtagatgtttactgtcgtaactcacaagtcgagggaatgaatagtacagccatttca 

309 QDVDQAYIDVDVYCRNSQVEGMNSTAIS 

12313 cgtttaattgttgtttacggacattaccgaaacttaaagattacaggtaaattatatcgttgtcaaggacatgttatcacgttg

337 RLIVVYGHYRNLKITGKLYRCQGHVITL

12397 tatggcggtggcgttaatctccattgtgactcgacggcacaagaagcacctttgacggacggttaccggcttattcaaacggct 

365 YGGGVNFYCDLMAQEAPLTDGYRF IQTA 

12481 gacaatcgagttaactatgatgggtttgttgtccgtggtttgtctaactcaacaaaagtaaacacaccaatgatctataaagca

393 DNRVNYDGFVVRGLSNSTKVNTPMIYKA

12565 cctcagactgttctctataatcgtagaatcgatcatgtgctaacaggtccaaatgcaagtaacgtatataactag 12639 

421 PQTVFYNRRIDHVLTGPNASNVYN*
182ORF004

4626 atggctgacaaaatcacagaacaagatgttcttcgtgccacaaatgtagaaacaccagtacaattaatgactgctatttataat 

1 MADKITEQDVLRATNVETPVQLMTAIYN 

4710 agttcatcatctctctctcaggcgaacgtacctacgccaaatgcagataacatcgaagcggttggtgcagggatcacacgctta 

29 SSSSLPQANVPMPNADNIEAVGAGITRL 

4794 gacgtagtaaaaaacgaatttatttcaacttcagttgaccgtattggtaaagtagttatccgacacaaatcttggcgtaaccct 

57 DVVKNEFISTLVDRIGKVVIRYKSWRNP

4878 ttgaaaatgcteaaaaaaggaaacacgcctctaggccgaacgactgaagaaactcccgtegacattgcacaggaacataagtcc 

85 LKMFKKGNMPLGRTIEEIFVDIAQEHKF

4962 aaccctgacgagtctgttacaggggtatttaaacaggaagttcccgatgtaaaaacattgttccacgaaattaatcgtgaaggt 

113 NPDESVTGVFKQEVPDVKTLFHEINREG

5046 tactacaaacaaacgatccaagaagcacggttagaaaaagcatttacttcatgggataattccaatagtctcgttgctggtgta 

141 YYKQTIQEAWLEKAFTSWDNFNSFVAGV

5130 atgaacgctttatacacaggtgacgaagtaagcgaatttgaatacacgaaattattaatagcaaactaccaagaaaaagagcta

169 MNALYTGDEVSEFEYTKLLIANYQEKEL

5214 ttcaaagagatcgaaattggcgaaattactgaatcaaatgcaaaagaatttatccgtaagatcaaatcaacctctaacaaatta

197 FKEIEIGEITESNAKEFIRKIKSTSNKL

5298 gaatttatgagttccgcttacaacgctcaaggagttaaaacatctacctcaaaatctgatcaatacgttattattgacgccgac

225 EFMSSAYNAQGVKTSTSKSDQYVIIDAD

5382 acagacgcaaccattgacgttgacgtttcagcagcggcattcaatatgagtaaaactgactttgtaggacacaaaatcgttatt

253 TDATIDVDVLAAAFNMSKTDFVGHKIVI

5466 gatgagtttcctaaaaaagaaggcgaagaatcgccaaatattgtggcagttattgtagatagtgaatggtttatgatctacgac 
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281 DEFPKKEGEESSNIVAVIVDSEWFMIYD 

5550 aaattgtacaaaacaacaagtctacacaaccccgaagggttacattggaattattggttgcaccaccaccaactatattctact 

309 KLYKTTSLYNPEGLYWNYWLHHHQLYST 

5634 tctcaattcgggaacgctgccgcttttgttaaatcagcaacaaaacctgccacaaaagttgctcttgcaagtgcaacaactagt 

337 SQFGNAVAFVKSATKPVTKVAFASATTS 

5718 gttgttaaaggatcatctaaagataccgcattgacatctacaccagtagaagcaacaaaccaacaaggagaagctgtttcatca

365 VVKGSSKDIALTFTPVEATNQQGEVVSS 

5802 gcaccagcactggttaaggcaaccgtaaaacaaacagcaggtaaagcgactgccgtaaccgtagaaggcttagaagtcggtcaa

393 APALVKATVKQTAGKATAVTVEGLEVGQ

5886 ccattagcaacattcacagccatcggaggtcaacaagcaacggttcctgttacggttacttctgactaa 5954

421 SLVTFTAIGGQQATVLVTVTSD*
182ORF005

12651 atggcaactcttacaaatgaacaaatagctagaggacaaacaatcgctaaaatactttcaaaatatggctataataaaaattca

1 MATLTNEQIARGQTIAKILSKYGYNKNS 

12735 caagcaggagccgtcgccaatctccattgggaaccggctggtttgaacccgaacagcaatgaatatggtggaggcggacatggg 

29 QVGVVANLHWESAGLNPNSNEYGGGGYG 

12819 ttaggtcaatggacgcctaaaagcaatccttatcgccaagcacaaatttgtgggttgtctaacgctaaagctgaaacgttggaa 

57 LGQWTPKSNLYRQAQICGLSNAKAETLE 

12903 ggtcaagcagagatcatcgctcaaggggataaaacaggtcaatggatggataatacacctgtttcttctgcaggttatactaac

85 GQAEIIAQGDKTGQWMDNTPVSSAGYTN

12987 cctcagacccttccagcatttaaacaatctgcaaatattgatgttgctacaaccaatcttatgcgtcactgggaacgccccggt 

113 PQTLSAFKQSANIDVATINFMCHWERPG

13071 aaacctcatatcgaagaaagacttgatcttgcacaagcttacagtaagcatattgacggtagcggtggcggtggcgtaaaacgt 

141 KLHIEERLDLAQAYSKHIDGSGGGGVKR

13155 tgctatggaaccccaatcaagaatacaaatcttgatcctaaaagtttcatgagtggacaactttttggcacgcacgcaggaaac

169 CYGTPIKNTNLDPKSFMSGQLFGTHAGN

13239 ggcagaccaaataatttccatgatggcttggaccttggttcaactgatcaccctggcaacgaaatgattgcacgctgcgatgga 

197 GRPNNFHDGLDFGSIDHPGNEMIACCDG

13323 acagtaacacatgttggaacaatgggagcattaagagcgtattttgtgataaatgatggtacttacaatatcgtttatcaagaa 

225 TVTHVGTMGALRAYFVINDGTYNIVYQE

13407 ttcagttataaccagtcaaacataaaggtaaaagttggcgacaaagttaagaacggacaagtttgcgcaatacgtgacgcggac

253 FSYNQSNIKVKVGDKVKNGQVCAIRDAD

13491 catttacatttaggttttactaaaaaagattttatgactgcgttaggatcttctttcatagatgatggaacatgggaagaccct

281 HLHLGFTKKDFMTALGSSFIDDGTWEDP

13575 ttgaagtttttagggcaatgttttggagatggagatactggcggagataatgacgataacaataaggataaaaatgatcttatt

309 LKFLGQCFGDGDTGGDNDDNNKDKNDLI

13659 tatctattgctatccgatgccttgaatggttggaaattttaa 13700 

337 YLLLSDALNGWKF* 
182ORF006 

14995 atgacaaatagcttaggcgttaaacttgaagagaaaaacttatactataaccctaacaatgctttaggttttaattgcctaatg

1 MTNSLGVKLEEKNLYYNPNNALGFNCLM

15079 ttgtttgtaataggcgcacgtggtataggtaaaacttatggttataaaaaatttgttgttaatcgctttattaaacacggcgaa 

29 LFVIGARGIGKTYGYKKFVVNRFIKHGE

15163 caatttatttatttaagaagattcaaaacagaacttaaaaagattcctcaatttttcaaaacaatggcgaaagaatttcctgac 

57 QFIYLRRFKTELKKIPQFFKTMAKEFPD

15247 cataaacttgaagtaaaaggaaaagaattctattgtgatgataaattaatgggttgggctgttccacttagtacgtggggaatt 

85 HKLEVKGKEFYCDDKLMGWAVPLSTWGI

15331 gaaaaatctaatgaatatcccgaagttcgtacaattttgtttgatgagtttttaattgagaaatcaaaaatcacttatttacca 

113 EKSNEYPEVRTILFDEFLIEKSKITYLP

15415 aacgaagccgaagccttattgaacatgatggaaacggttttccgaagacgtacaaatacaagatgtgttatgttgagtaatgca

141 NEAEALLNMMETVFRRRTNTRCVMLSNA

15499 actagtgtagtgaacccttattccttgtatttcaacccgcagccagatttgaataagcgttttaatctataccaagatcgaggt 

169 TSVVNPYFLYFNLQPDLNKRFNLYQDRG

15583 atattgattgaattgtgtgattcaaaagacttcgcagaagtgaagagagaaacaccttttggtagattgattcgtggaacagaa

197 ILIELCDSKDFAEVKRETPFGRLIRGTE

15667 tacgaagattttagtatcaacaatgagtttgtcaatgatagcgatacgtttattgaaaagagaagtaaaaatagtagtttctta 

225 YEDFSINNEFVNDSDTFIEKRSKNSSFL

15751 tgcgccattgcttttgaagggaaaatctttgggtattggatagacgctgaaacaggttgtgtctatgtgagttatgattatcaa 

253 CAIAFEGKIFGYWIDAETGCVYVSYDYQ

15835 ccaaatacaaatcatttttacgcaatgactacgaaagaccatgaagaaaatagattgctgatgaaaaattggcgaaataattat

281 PNTNHFYAMTTKDHEENRLLMKNWRNNY

15919 tatctttcaacagtggcgaaagcattcaagaatagttatctgcggtttgataacattgttattaagaatttacattatgatttg

309 YLSTVAKAFKNSYLRFDNIVIKNLHYDL

16003 tttaataagatgaaaatctggtaa 16026 

337 FNKMKIW*



182ORF007

7795 atgagtagacgaaaaggtgcaggacttgctagaaataaccgttatacagcaaaaagcagaccttatccaaatgaaccctattca 

1 MSRRKGAGLARNNRYTAKSRPYPNEPYS 

7879 agtgatgtagaagaaatcagctactatgaacattatcgtagacaaetcacgctccttacgtttcagttgtttgaaegggaaaat 

29 SDVEEISYYEHYRRQLTLLTFQLPEWEN 

7963 ttgccaaaaccaattgaccctcgttatttagaaattgccccacacaccaatggttatcttggcttctttaaagaccctacacct 

57 LPKSIDPRYLEIALHTNGYLGFPKDPTL 

8047 gggcccatggtttgcgcaggggcagaagatggtcaaatcgaccatcaccacaaccctattttctttacagcaaacgaagcaacg 
85 GFMVCAGAEDGQIDHYHNPIFFTANEAM 

8131 tatcacaagagatatcctgttttaagatatgatgatgatgatgataaatcaaaatgtatcatgttgtataataatgacttga.aa 

113 YHKRYPVLRYDDDDDKSKCIMLYNN DLK 

8215 gttcctacgttaccaagtttacatcgttttgctttagatatggcggacataaaccagatatcacgagtgaatcgaagagcgcaa 

141 VPTLPSLHRFALDMADINQI S R V N R R A Q 

8299 aaaacacctgtaattattcaaactgatgaaaagaaatacttctcattgctacaagcttataaccaaattgacgaaaataatcag 

169 KTPVriQTDEKKYFSLLQAYNQIDENNQ 

8383 gctgtttctgtggataaagatatggagtttgacgaaccttttaatgtacggcaaacaaatgctccatatgtagtagataaacea 

197 AVFVDKDMEFDES FNVWQTNAPYVVDKL 

8467 cgatcagaattgaacgaagtatggaatgaagtgttaacttttctaggtatcaacaatgctaacgtagataagaccgcacgtgta 

225 RSELNEVWNEVLTFLGINNANVDKTARV 

8551 caaacatcagaagtcttatctaacaatgaacagattgaaagttcaggtaacatcttgttaaaatcaagaaaagagttttgcgat 

253 QTSEVLSNNEQIESSGNILLKSRKEFCD 

8635 cgtgtaaafccgtgtctttggcgatgaacttgaeggaaagattgacgtgaagtttagaacagacgccgttcgacaattacaactg 

281 RVNRVFGDELDGKIDVKFRTDAVRQLQL 

8719 gcggcaggtcaatcaaaaaaagaccagatgagtggagggttgccaagtgctacttaa 8775 

309 AAGQSKKDQMSGGLPSAT* 
182ORF008 

14105 atgatgaacggtattgatatctctagttatcaaacaggaattgatctttcaaaagttccatgcgattttgtaaatattaaagca 

1 MMNGIDISSYQTGIDLSKVPCDPVNIKA 

14189 acaggcggaacaggttatgtaaaccctgattgtgaccgagcatttcaacaagctttgtctttaggtaaaaagattggtgtgtat 

29 TGGTGYVNPDCDRAFQQALS LGKKIGVY 

14273 cattttgcgcatgagaggggtttagaaggtacacctcaacaagaagcgcaattctttttagataatattaagggttacattggt 

57 HPAHERGLEGTPQQEAQFFLDNIKGYIG 

14357 aaagctgttcttattcttgactttgaagggtcaaaccagaaagatgtaaattgggcgaaagcatttcttgattatgtttataat 

85 KAVLILDFEGSNQKDVNWAKAFLDYVYN 

14441 aaaacaggcgttaaagcatggttttatacgtataeagcaaacctcaatacaactgatttttctagtattgcaaaaggcgattat 

113 KTGVKAWFYTYTANLNTTDFS S IAXGDY 

14525 ggtttatgggttgctgaatatggatcaaatcaaccacaaggctactctcaaccagcgccacctaaaacaaataattttccaatt 

141 GLWVAEYGSNQPQGYSQPAPPKTNNFPI 

14609 gttgcctgttttcagtttacaagtaaaggacgtttaccaggatacaacggcaatcttgatttgaatgttttctatggcgatggt 

169 VACFQFTSKGRLPGYNGNLDLNVFYGOG 

14693 aatacatgggatctgcatgtaggtaaaaaacaggaccaaattgctcctcctgaaaataaaatatttgacgccacaagtgatgag 

197 NTWDLYVGKKQDQIVPPENKI FDATSDE 

14777 tttattttcactcttacaacaggtagcacaagcgtgttttattttgacggagaaacgatctttgaattgtctgatccaacacaa 

225 FIFTI»TTGSTSVFYFDGETI FELSDPTQ 

14861 ctcgatcatattagaggaacatacaatcatgttcatggaaaagaaatcccatcaatggtgtggacacctgaacaatttgatatt 

253 LDHIRGTYNHVHGKEI PSMVWTPEQFDI 

14945 tacttaaaaatgtatgaaaagaaaccagtatataaatag 14983 

281 YLKKY5KKPVYK* 
182ORF009 

8765 gtgctacttaaacgttacattgaaagtttcacttattaccaacctgaattatctcgaaaagaacgtattgaagttggccgaaaa 

1 VLLKRYIESPTYYQPELSRKER X BVGRK 

8849 caattgtttgattttgattatccgttttatgacgaaacaaaacgagcagaatttgaaacaaaatttatcaatcacttttacttg 

29 QLFDFDYPFYDETKRAEFETKF IHHFYIi 

8933 agagagataggctcagaaacgatgggateatttaagtttaatcttgacgaatatttaaatctaaacatgccctattggaataaa 

57 REIGSETMGSFKFNLDEYLNLNMPYWNK 

9017 atgttectatcaaatcttgaagagtttccgatttttgatgacacggactacaccattgatgagaaacagaaattgttaaatgag 

85 MFLSKLEEFP1FDDMDYTIDEKQKLLME 

9101 attgatacaaacatcaaagcgaatcgtgatgaatcgaagaaccaaacgaagcaagtagatcaaacagacaacagaaacaaaaat 

113 IDTNIKANRDESKNQTKQVDQTDNr. MKN 

9185 acacgtgacacaggaacaaccgattctttctcaaggaacacttatacagacacccctcaaaaagatttgagaattgccagcaat 

141 TRDTGTTDSFSRNTYTDTPQKDLRIASM 

9269 ggagatggaacaggtgtaatcaattatgcaacaaataccacagaagatttgagtaaagaaacaacaagctccacaggcgttgaa 

169 GDGTGVINYATN ITEDLSKETTSSTGVE 

9353 acaaacaacgacaaaacaaatcaaaatacacgaagcaacgcttctgaaaaagaaacaaagaacacagacattaataaagatcaa 

197 TNNDKTNQMTRSMASEKETKMTDIMKDO 

9437 aatcaaaccaaagatacgattacacgatataaaggtaaaaagggaaacactgattatgctgacttactcgaaaaatatcgtaga 

225 NQTKDTITRYKGKKGNTDYADLLEKYRR 

9521 agtgttttgagaattgagaaaatgatctttagagaaatgaacaaggaaggcttatttccccttgtttatggagggaggtag 9601 

253 SVLRIEKMIFREMNKEGLFLLVYGGR* 



182ORF010 

1310 ttgaccgtaagaatatcaaagaatgatagagccaagttagagaaaatctacggtaaaectaacaaagctcgtaaaaaatacaat 

1 LTVRISKMDRAKLEKIYGKSNKARKKYN 

1394 cgtttaagacaaaaaggagttgaggaaaggcaacttccaactgttccaacatcaaagaaaagacttattgactacgtaaaatca 

29 RLRQKGVEBRQLPTVPTSKKRLID Y V K S 

1478 acaaatatgagtcgtagtgattttaacaagacgttagacgagttggtagactttgcacaaccttacaacgagaattaoattttt 

57 TNHSRSDFNKMLDELVDFAQPYNENYIF 

1562 gagatcaacaagcgaaacgttgcaatctcaagagcgcaaatcaaagaagcgcaaactaaaacagagcaagctcaaaaagcgaaa 

85 EINKRNVAXSRAQI KEAQZKTEQAQKAK 

1646 gaagaacactacaaagagcttaacaaagttgaagttaagaagcccacagaaaacacaattgtcacaccaactattttaacagag 

113 BEHYKELNKVEVKKPTENTI VTPT2LTB 

1730 ttaggtgctgacttaccttttcaagcaataccagattttaatattgacgctttcacttccccagaaggagttcagtctcattta 
141 LGADOPFQ AI PDFNIDAFTS PEGVQSYL 

1814 gaaaatataggaaaacaagacgaacaacattttgacgaaagagaccaacttcactacgacaacctcagacaagcgatgtttact 

169 ENIGKQDEQYFDERDQLYYDNFRQAMFT 

1898 attttcaattcagacgctgacgatactgtccgtctacctgacccaatggggctcgatctacttacgaaaacacacgctagtaac 

197 IFNSDADDIVRLLDSMGLDLFMKTYVSN 

1982 ttcttagacacgaacctcgactacatttacgacgaagcagaagcacaacagaaaaaagaacaagcctacagtaagactgcaaaa 

225 FLDHNLDYIYDEAEVQQKKEQVYSKIAK 

2066 gtgatcgagtctgaaacaggtggagaagtcccctcatataaccccacgaagaacatcacaattaattcagaaacaggagaagaa 

253 VIESSTGGEVPSYNPTKNITINSETGEB 

2150 ctacga 2155 

281 L * 
182ORF011 

9607 atggtagattttaaccccgacaagcggtttgacggtttacccgctgtattcaaagaacgctttagcaaataccctcatactgaa 

1 MVDFNPDKRFDGLPAVFKBRFS KYPHTE 

9691 tacagatatgaattactattagatgaagaagtatcggcctcaattgcctatctgaatgaagtcggcgctccagttaatgatatg 

29 YRYEI>LLDEEVSAL1AYLMEVGALVNDM 

9775 agtggttatttaaattactttatcgaacactttgttgagaagtcagaagagaccacaaatgacacactcaaaaaatggttgtct 

57 SGYLNYF I EH FVE KLEEI TND T LKKWLS 

9859 gacggtacgttagaaaatttaaccaatgatactgtttttgcaaattatatcaaagaaatcaaaagattacaaatctcggctgct 

85 DGTLBHLINDTVFANYIKEIKRLQILVA 

9943 gaaacacgtgccaacagtgtgaatattcttttgacaaaaaacaaaccggafcgttgctgacgaccgaacatettggtataagatt 

113 ETRANSVNI LIjTKNKPDVADD RTFWYKI 

10027 caacgcgacaataetgattatggagccgatcctactgacacgttacgtattgttgcaatcaataaagttagtggctggaatacc 

141 QRDNTDYGADPIDTLRIVAIMKVSGWMT 

10111 gctacaggagatatttatcttaacattaaaggaacggagggtgtataa 10158 

169 ATGDIYLNIKGTEGV* 
182ORF012 

10872 atggcaaataaaaatattcaaatgaaggatagcaatgacaataatttatatccaagtgttcgageagaaaacttgttagatttg 

1 MAKKNIQMKDSNONHLYPSVRAEHLLDL 

10956 accagccgtgctgaattaacaacgacaaactgtcaattacacgcagccggtgacaaaacaaatgcaatccctcacctcggtgca 

29 TSRAELTMTHCQLYAAGDKTMAI3YLGA 

11040 gtaggtacgcccgaaggtacgacaaagtttactgaaagtttgacaaaccctgtgatcacaacgccaccagaaggccttagacca 

57 VGMLEGMI KFTBSLTNPVITTLPBGFRP 

11124 ataagaacaaaacgtactggttgrttcgcaaaaeatcacacaccaaatccaacagaeacaaaagaaacggtttacgtatcaacc 

85 IRTKRIGCFAKYYTPNPTDTKEMVYVSI 

11208 acacctgatggcaaagtaactgtaaacgacaatgtaggcaaaatcgaatatctatccctagataattgcgttttccctctaaaa 

113 TPDGKVTVNDNVGKIBYLSLONCVFPLK 

11292 taa 11294 

141 * 
182ORF013 

10456 atggcagataaaaatattcaaatgcaggataaagaccataatcgcttaatgcctgtcacaattgctaaaaacgttctaacaggc 

1 MADKNIQMQDKDHNRLMPVTI AKNVLTG 

10540 gactctaatcttgaattagttaatgctgaaataagaggtaacgctagtgaagctaaaacacttgcacaacaagctaaagaaact 

29 D5NLELVNAE IRGNASEAKTLAQQAKET 

10624 gctgctggcttgtcaacagaaattgacacagtaacaccaaccgcaaatcaagcgttgacgaaggccggtacagcacaacaaacc 

57 AAGLSTBIDTVTSTANQALTKAGTAQQT 

10708 gcagaacaagcgaaaacaacagcaaacagtatcagcgcagttgcaacggcagctaaaaacacagctgattcagcacaaaaaagt 

85 AEQAKTTANS ISAVATAAKNTADSAQKS 

10792 gcaactgacctagctgttcgagtaagcagtttagaggacacagcaatacaatatactgtatcaccatag 10860 

113 ATDLAVBVSSLEDTAIQYTVLP* 
182ORF014 

13716 atgatagaatacatcacacaacggttggcagacgataaccatcttgtttatggtttgattatatggttaatggttgcaatgatt 

1 MISYITQWLADDNHLVYGLI IWLMVAMI 

13800 atcgattttgtgttaggttttacaattgccaaatccaacaaggaaatcgactttagtagttttaaagctaaagcaggtatcact 

29 IDPVLGFTIAKFNKEIDFSSFKAKAGII 
13884 gttaaggtggcagaaatggttctagtggcttacctcactcctgtagcagtaaaattcggcgcagtaggtattacaatgtatata 

57 VKVABMVLVVYFI PVAVKFGAVGITMYI 

13968 acaatgttggttggtttgattttatcagaaatttatagtatactaggacacatttcagaeatcgatgatgataataattggact 

85 TMLVGL1LSEIYS I LGHISOIODDMMWT 

14052 gatcatgttaagaagtttttagacggaacactcaacagaaaggacgatactaaatga 14108 

113 DYVKKFLDGTLNRKDDIK* 
182ORF015 

854 acggaaatcgtaaaaagcacacttgacacacaaacaccagaaggaacgttacaagCattcaacgccacaaacggggceccaatt 

1 MBIVKSTFDTQTPEGMLQVFNATNGASI 

938 ccgttacgtaacgcaattggcgaagtaccagaatcgaaagatattctagtttacccagacgaagtctctggttccggcggagcc 

29 PbRKAIGEVLELXDILVYSDBVSGFGGA 

1022 gaaccatcacaagcagaactagtcgccttcttcacagaagatggtaaaacttatgcgggtgtatcagcagtagcaacaaaatca 

57 EPSQABLVAPFTEDGKTYAGVSAVATKS 

1106 gccaaaaacctaategatacgacgactgccaaccctgacatcaaaccaaaaatttcttttgtcgaaggaaaatcaaacggcgga 

85 AKHLIDMMTANPDIKPKI SFVBG-KS_N«G 

1190 caaaaatccgtaaatctacaagtggtttcactgtag 1225 

113 QKFVHLQVVSL* 
182ORF016 

17033 atgattaacaatttatcattaattctagagggtctaaaccaactaaccaaagatgacaacgacagtttagcgtctatcaagtca 

1 MINNLSLI LEGLNQLT X D DNDSLASIXS 

16949 gaaataacacaaggaggaaaacaactaattttatacattgaccacgttacaaaagagtccgtgttaacaeacgataaatataac 
29 EITQGGKQLILYIDYVTKEFVLTHDKYN 

16865 eatgtttaccttgatagccattgcattaatatcgcaataacgaaatcaatgaaaagcgttgaacactatgcggaacaattgaaa 

57 YVYLDSHCINIAITKSMKSVEHYAEQLK 

16781 caegacggatacaaacaaattacggacaaatag 16749 

85 HDGYKQITDK* 
182ORF017 

154 atgaaatattcactacaacaaatagatgaaattaaaccaacaattttcagaateagattaaaaaggcatgaactagaggaactg 

1 MKYSLQQIDEIKSTIFR IRLKRHELBEL 

238 gtggacgaagtaaacgatattgctaaagatccggaggaaagatatcttttatcgttteattacaeagaagaagaacgtttgett 

29 VDEVNDIAKDPEERYLLSFYYTEBBRIiF 

322 gaaattccctctgcaagattaatagattattacaacgaaaagatcacaaatctgaaatcggaaatcatatcactcgaaaaaaga 

57 BIPSAR LIDYYNEKITNLKSEI ISLEKR 

406 ttacaaaaactagtaaaataa 426 

85 LQKLVK* 
182ORF018 

16737 atgattgcacgaacattcaaagaacaccgcgaaccaactgaacggttacgtctctaccgtaaacgtaaccttccagacaatgaa 

1 MIARTFKEHRELIEWLRFYCKRNLSDNE 

16653 aaaacagagatcatagaggggaccttacaagatttcgacgttccggaaataaatatcaccgaacttttgttaactcattcaacg 

29 KIEIIEGTLQDFDVPBINITBLI»LTHST 

16569 ctactacccgaaccgagtcaatttaacattcttgaaaagtactgtcaggcaacgaaattagtaacttcacacgtaaaagttggt 

57 LLPESSQFMILBKYCQAMKIiVTSYVKVG 

16485 tctcgctatcagctagcgttacaaataccaaaaggctatttaaaggaggtggaataa 16429 

85 SRYQfcALQIPKGYLKBVB* 
182ORF019 

4323 atggaaattaaagaacatgaatcaattttaaatggtattcttgaaagtgtcacagacggtgaagcaagatcaaagactgtagaa 

1 MBIICBKESI LNGILBSVTDGBARSKIVB 

4407 catcttgaagcattgcgagaagactacggagcaacaactgaagctttgacatcagcaaatagcacacttgaaaagtcaaagaaa 

29 HLEALREDYGATT EALTSANSTLBKLKK 

4491 gataacgaagcgttggctatttcaaactcaaaattgttccgagaacgagcgatcgtagaaccagcagaaaataacgaaccagaa 

57 DNEAL.VISNSKLFRBRAIVEPAENNEPE 

4575 acagaccagaatattacactagacgatttaggaatttaa 4613 

85 TDQNITLDDLGI* 
182ORF020 

10158 atggcagacattagaacacaactaacaagtgaagatggatcagacaatttatttccaacttcaaaagcegttaatattatgact 

1 MADIRTQL.TSEDGSDNLFPISXAVN1MT 

10242 aatagcggtacgaatgtagaaggagaattgggtacactcaaacaaaatgacgaaacaatgaatacctcagttcaaaatgctgta 

29 WSGTNVEGELGTbKQNOETMNTSVQNAV 

10326 gttactgccaatcaagcaaaagattctgtagctgaattaaatgtaaatgttggtaaactaaccaatcgaataacaacattagag 

57 VTAHQAXD5VAELNVNVGKLTNR1TTLB 

10410 agtacagtggctaatcttgatggtattcgttatgtagaggtgtaa 10454 

85 STVANLDG1RYVBV* 
182ORF021 

17339 atgaacaataaatcattaatagctgaaaaaggagaggtatctctacttcacccctttaatgagtgggatatgaattatcatatc 

1 HNNKSIilAEKGBVSLLHPrNEWOMJIYHI 

17255 atagataccgaaaacaataaacattatcttattgatattgatgaggtaggcgatgaggaatattgtttgttatcttttgaagaa 

29 IDTEMMKHYLIDIDEVGDBEYCLLSPEE 

17171 ctaaaggaattagatatggatcttatttccgagtattcatggaaaactacagaaataacatattaa 17106 

57 LKELDMDLISEYSWKTTEITY* 
182ORF022 

12868 gtgggttgtctaatgctaaagctgaaacgtcggaaggccaagcagagatcatcgctcaaggggataaaacaggtcaatggatgg 

1 VGCIiWUKLKRWKVKQRSSLKGIKQVMGH 

12952 ataatacacctgtctcttctgcaggttatactaaccctcagaccctttcagcatttaaacaatctgcaaatattgatgttgcta 

29 IlHLFLLQVILTLRPFQHLNNLQILKLt, 

13036 caattaattttatgtgtcactgggaacgccctggtaaacttcatatcgaagaaagacttgatcttgcacaagcttatagtaagc 

57 QLILCVTGMALVNFISKKDLILHKLIVS 

13120 atattgacggtagcggtggcggtggcgtaa 13149 

85 ILTVAVAVA* 
182ORF023 

12189 atggttgttgttttggacatgcaagttacgcgcatatcctcgataataacttacgccaccttcgattgtgttaccagaaattcc 

1 MVVVLD, MQVMRISSI ITYATFDCVTRNF 

12105 acagaaattaattacattctgataatcatcgtcattgtcgataatgatcgctgtacaaaaatgaatacggttgtttttcacaaa 

29 TEI HY I t* I IIVIVDNORCTKMMTVVPHK 

12021 gaaacctctaaaacctgtacccctagtattgatatcgttcccttgccacataccacttacatcgggaaaagctgttttgataat 

57 ETSKTCTPSIDIVPLPHTIYIGKSCPDN 

11937 tgcecgagagacactagagaatag 11914 

85 CLRDIRE* 
182ORF024 

6174 atgcctgtaactatctcatctttaaaaacgaagaaacttatcctagtaaatggcagtatgcctttgttactgatattgaatata 

1 MLVTISSLKTKKLILVNG9MPLL-L ~N I 

6258 agaatgacaacacaagtttcgttacctttgaaattgatgttttacaaacttatcgtttcgatattggtatacgagaaagtttca 

29 RMTTQVSIjPLKLMFYKLIVSILVYBKVS 

6342 ttgcaaaagaacaccctcaactttactattcgaatggaatacctetcattaatacaactgaagagtcgcttgattacggtagag 

57 LOKNTLNFIIRMEYLSLIQLKS'RLITVE 

6426 aatacacaacaacaaatgtaa 6446 

85 NTQQQM* 
182ORF025 

548 atgggtcgaaaactaatgcaacgaaacgtaacatcaaccaaagtagaattctcagaagttatcgtacaagatggagcgccaaca 
1 MGRKLMQRNVTSTKVEFSEVI VQDGAPT 

632 atcgcaccatgcgaaccagctgccttaacaggaaaacttccagaagaaaaagctttatcagcgaccaaacgcaaaaaccccgat 
29 IVPCEPVVLTGKLSEEKALSAIKRKMPD 
716 aaaaacgtagctgcaacaaacgcttcacatgaaacagcgctccacacaatgccagtcgacaaatttatcgagctagcagacaaa 
57 KNVVVTNVSHBTALYTMPVDKFIELADK 
800 tcaacacaagcctaa 814 

85 STQA* 

182ORF026 

13259 atggaaattatttggtctgccgtttcctgcatgcgtgccaaaaagttgtccactcacgaaacccctaggaccaagatttgtatt 
1 MEIIWSAVSCMRAKKLSTHETFRIKICI 
13175 cccgactggggttccatagcaacgtcttacgccaccgccaccgctaccgtcaatatgcttactataagcctgtgcaagatcaag 
29 LDWGSIATFYATATATVNMLT1SLCK2K 
13091 cctttcttcgatatgaagtttaccagggcgttcccagtgacacataaaattaattgcagcaacatcaatatttgcagactgttt 
57 SFFDMKFTRAFPVTHKINCSNINICRLF 
13007 aaatgctga 12999 
85 KC* 

182ORF027 

14896 atgaacatgattgtatgttcctctaatatgatcgagttgtgctggatcagacaattcaaagaccgtctctccgtcaaaataaaa 

1 MNMIVCSSNMIELCWIRQFKDRFSVK1K 

14812 cacgcttgtgctacctgttgtaagagtgaaaataaacccaccacttgtggcgtcaaatattccattttcaggaggaacaacttg 

29 HACATCCKSENKLITCGVKYPIFRRMNL 

14728 atcccgttttttacctacatacagatcccatgcattaccaccgccatagaaaacattcaaatcaagactgccgctgtatcctgg 
57 ILFFTYIQIPCITIAIEMIQIKIAVVSW 

14644 taa 14642 
85 * 

182ORF028 

14430 atgtccacaacaaaacaggcgtcaaagcatggttctatacgtatacagcaaacctcaacacaactgatttttctagtattgcaa 

1 MFIIKQALKHGFIRIQQTSIQLIFLVLQ 

14514 aaggcgattatggttcatgggttgctgaatatggatcaaatcaaccacaaggctactctcaaccagcgccacctaaaacaaata 

29 KAIMVYGLLNMDQINHKATLNQRHLKQI 

14598 atttcccaactgtcgcctgttttcagttcacaagtaaaggacgtttaccaggatacaacggcaatcttgatttga 14672 

57 I FQLLPVFSLQVKDVYQDTTAI LI * 
182ORF029 

17606 atgaatgaaccgatcgtatacacagaaatttattcaaataacgtggtatgtatgaaaatttccagagatgaggataaactcagt 

1 MNEPIVYTEIYSNNVVCMKIFRDEDKLS 

17522 aaattcctctatttagaatttgaggtggatgaggctaaaaagttacttgaaaataaaacaatctcatttgatgataactggact 

29 KFLYLEFEVDEAKKLLBNKTISFDDNWT 

17438 ttctcaataaattatccagaatattaa 17412 

57 FSINYPEY* 
182ORF030 

16429 atggctacattctacaaggaaccaatatacgatatcacagtattttatatagatggttgggaggttctgatacacaaaaccgaa 

1 MATFYKEPIYDITVFYIDOWEVLIHKTE 

16345 cctcccaccttaacaaaagcattaaaatatagccgtatatacctagaaatggatatagtgaaecgcgttagaatagaaagaaat 

29 PLTLTKALKYSRIYLEMD1VMCVRIERN 

16261 ggacgtcctatagctaeattttacagggaattattaaaactgtataaggagaaagaactatga 16199 

57 GRPIATFYRELLKLYKEKEL* 

182ORF031 

8603 atgccacctgaactttcaatccgctcatcgttagataagacctctgatgcttgtacacgtgcagtcttacctacgctagcattg 

1 MLPBLSICSLLDKTSDVCTRAVLSTLAL 

8519 ttgatacctagaaaagttaacacctcattccatactccgttcaattctgatcgtagtttatctactacatatggagcatttgtt 

29 LIPRKVHTSFHTSFNSDRSLSTTYGAFV 

8435 tgccatacattaaaagattcgtcaaactccatatctttacccacaaaaacagcctga 8379 

57 CHTLKDSSNSISLSTKTA* 
182ORF032 

11413 atgtttcatcaaaaacaactcgttccgggttcgtttcagggtgcaataggtaattcttttgattttcttctttcatcttgtcca 

1 MFHQKQLVSGSFQGAIGWSFDFLLSSCS 

11329 tctgaatatcaatccgtccccccacatgaacctccttattttagagggaaaacgcaatcatccagggacagacactcgatttta 

29 FEYQFVLPYEPPYFRGKTQLSRDRYSIL 

11245 cctacattgtcacctacagttactttgccatcaggtgtgatcgaeacataa 11195 

57 PTLSFTVTLPSGVIDT* 
182ORF033 

4942 atgccaacaaaaatttcttcaaccgtccgacctaaaggcatgtttccttctttaaacatttccaaagggttacgccaagatttg 

1 MSTKISSI VRPKGMFPFLN I FKGLRQDL 

4858 tatcggataactactttaccaatacggtcaactaaagttgaaataaatccgttttttactacgtctaaacgtgtgatccetgca 

29 YRITTLPIRSTKVEINSPFTTSKRVIPA 

4774 ccaaccgcttcgatgttatctgcatttggcaeaggtacgttcgcccga 4727 

57 PTASMLSAFGIGTFA* 
182ORF034 

6160 gtgcttatctactctaaaaactcccccgagttgtgcatcccttcgacaagaacaatctccactctcgtraagaacaggaaacga 

1 VFIYSKHSPELCIPLIRTISILVKNRKR 

6076 atcaaagracgattcctgctcctgttgagctttaaaccatcttgtgtgcgtacaggtgtcaccaaaaggcacgttagccaacaa 

29 IKVRFLFLLSFKPSCVCIGVIKRHVSQQ 

5992 ttttacatttgtataccttcttgccataattgtccteettag 5951 
57 FYICIPSCHNCPP* 
182ORF035 

t 5750 atqgcqcataagaaaccactactctcacctcccLCCCcaataaacgtaccactaccactgacaaactcaccgccgataccaaaa 

1 M A HKKLLFLLLFS INVSLSLTNSLLILK 

156-74 tcttcqcattccgctccacgaaccaatctaccaaaaggcgcttctctcttcacttccgcaaagccttttgaatcacacaattca 

29 ss ySVPRIMLPKCVSLFTSAKSPESHHS 

15590 atcaatataccccgaccccga 15570 

57 INIPRS* 
182ORF036 

2315 atgtctgtgccgccttgcattttacaccactcaaaaaaagaaccgatttctaaaccgaacgccatatcgtcaacgttgtctata 

1 MSVLPCILHHSKKESISKPNVILSTLSI 

2231 tcgcacacgccccacgaccatacacgacaaccgtcgagaccagctgttgtttcaaagtcgccagcatatcccttaatcataatt 

29 SHTPHDHTRQSLRSVVVSKSPVYPLIII 

2147 cttctcctgtttctgaattaa 2127 

57 LLLFLN* 
182ORF037 

122B0 gtgagtcacgacaataaacatctacatcaatacaagcttgacccacatcctgaaactcaaacaaagcgtttctacttccgtatg 

1 V S YDNKHLKQYKLDPHLBTQTKRFYFRM 

12196 ccaqaaaacggccgtcgctttggacatgcaagtcargcgcacatcctcgataataactcacgccaccttcgattgtgctaccag 

29 LENGCCFGHASYAHILDHNLRHLRLCYQ 

12112 aaatttcacagaaatcaa 12095 

57 KFHRN* 
182ORF038 

14769 gtgatgagtttattttcactctcacaacaggcagcacaagcgcgttttattttgacggagaaacgacctttgaattgtccgatc 

1 VMSLPSLLQQVAQACFI LTEKRSLNCLI 

14853 caacacaactcgatcatattagaggaacatacaatcatgctcatggaaaagaaatcccatcaatggtgcggacacctgaacaac 

29 QHNSIILEEHTIMFMEKKSHQWCGHLNN 

14937 ttgatatttacctaa 14951 

57 LIFT* 
182ORF039 

9992 atqttgctgatgatcgaacatttcggtataagattcaacgcgacaatactgattatggagccgatcctattgacacgtcacgta 

1 M L LM IEHFGIRFNATILIHEPILLTRYV 

10076 ttgtcgcaatcaataaagttagtggccggaacaccgctacaggagatatttatcttaacattaaaggaacggagggtgcataat 

29 L L Q S I KLVAGI PLQE I FI LTLKERRVYN 

10160 ggcagacattag 10171 

57 GRH* 
182ORF040 

16202 atqaqaaaaqatctcgtctacattaacacacccgatccaaaagcaaacaaaaaggcgctagcaaaaatcactaacgccaaagaa 

1 M R KDFVYINTPDPKANKKALAKITNAKE 

16118 ccaaaacaaaactatcgcagactacaattactatgctatctactattcatcattgtaatagaactaatcgtggtagctctacta 

29 PKQHYRRLQLLCYLLFIIVIELIVVALL 

16034 aaatag 16029 

57 K* 

182ORF041 

3886 atqgaactatataaagcaatgtttatcgcacgtgatgaaggcactactgacggttacgataccgaacactatgtagatactcct 

1 MELYKAMFIVRDEGTIDGYDTEHYVDIS 

3970 ttacatgactttgaagaaatatatggaaaagaaacacgtgaaattgaagcagtaacattagcaaaaacaggaaatttaaaaaaa 

29 LHDFEEIYGKETREIEAVTLVKTGNLKK 

4054 taa 4056 
57 * 

182ORF042 

10832 gtqccctccaaactgcttactcgaacagctagatcagttgcacttttttgtgctgaatcagctgtgtttttagctgccgttgca 

1 V SS KLLTRTARSVALFCAESAVFLAAVA 

10748 accgcgctqacactgtttgctgttgtccccgcttgtcctgcggtttgttgtgctgtaccagccctcgtcaacgcttga 10671 

29 TA L I LFAVVFACSAVCCAVPAFVHA* 
182ORF043 

10652 gtgtcaatttctgtcgacaaaccagcagcagtttctttagcttgttgtgcaagtgttttagccccactagcgctacctcctatt 

1 VSISVDKPAAVSLACCASVLASLALPLI 
10568 tcagcattaactaattcaagattagagtcgcctgttagaacatttttagcaattgtaacaggcattaaaegattatga 10491 
29 SALTNSRLESPVRTFLAIVTG IKRL* 

182ORF044 

6457 atgaaaagttgtcacacttgttgccgtgcatcctccaccgtaatcaagcgactcttcaattcgtattaatgaaaggtaccccact 
1 MKSCYICCCVPSTVIKRLFHCIMERYSI 

6373 cgaacaataaagtcgagggcgcccttttgcaatgaaactccctcgtataccaacatcgaaacgataagcttgtaa 6299 

29 RI I KLRVPFCNETFSYTNIETISL* 

182ORF045 

6729 acqaatqqtatacccgtatacgacgttacatacatcccgactatcttatccaaaaaaggttctctcgttgtaagaaacgccacg 
1 M No rpvYDVTYIPTILFKKGSFV-VRJiTV-M 

6645 taccctccaaaattagcactgcccgccccatttggtttgtatacctccccacttgaattgacaggaagcaaataa 6571 

29 YSPKLALPAPFGLYTS PLELIGSK* 

182ORF046 

2372 atggcttcaaatggtgcaaagaagcaaaagaagatcgaacatcctccacactcatatcaaatatgggtcaatggtatgctctgg 
1 M VSWGVK KQKKIEHSP HSYQIWVNGMLW 

2456 aaatttgttgggaagttaactacacaacaacaaaaccaggtaaaacgaaaaaagagaaacctcgaacaataa 2527 

29 KFVGKLITQQQNQVKRKKRNLEQ* 
182ORF047 

13353 atgctcccattgttccaacatgtgttaccgccccatcgcaacatgcaatcatttcattgccagggtgatcaattgaaccaaagc 

1 MLPLFQHVLLFHRNMQS FHCQGDQLNQS 

13269 ccaaaccatcatggaaaetaeteggtctgccgtttcctgcatgcgtgccaaaaagttgtccactcatga 13201 

29 PNHHGNYLVCRFLHACQKVVHS* 
182ORF048 

3395 atgtcagggcctgccccgaactctccatacaagctacccaacacacccttggcgttagcttttctagccccttcggtggtgttc 

1 MSGFVPNFPYKLFNI PLALAFLAPSVVF 

3311 tttacttcgatccatttatcgaeccagcctttgaacatatcacaagaagctttgaacatatatccgeaa 3243 

29 FTSIHLSIQPLNISQEALNIYP* 
182ORF049 

1578 atgttgcaatctcaagagcgcaaaccaaagaagcgcaaattaaaacagagcaagctcaaaaagcgaaagaagaacactacaaag 

1 MLQSQERKSKKRKLKQSKLKKR KKNTTK 

1662 agcttaacaaagttgaagctaagaagcccacagaaaacacaactgtcacaccaactattctaa 1724 

29 SLTKLKLRSPQKTQLSHQLF* 
182ORF050 

8012 atggttatcttggtttctttaaagaccctacacttgggttcatggtttgcgcaggggcagaagatggtcaaatcgatcattatc 

1 MVILVSLKTLHLGSWFAQGQKMVKSl II 

8096 acaaccctattctctetacagcaaacgaagcaatgtaccacaagagataecctgttetaa 8155 

29 TTLFSLQQTKQCITRDILF* 
182ORF051 

9390 atgcttctgaaaaagaaacaaagaacacagacattaataaagatcaaaatcaaaccaaagatacgattacacgatataaaggta 

1 MLLKKKQRTQTLIKI KIKPKIRLHOIKV 

9474 aaaagggaaacacegaetatgctgacttactcgaaaaacatcgtagaagtgtttega 9530 

29 KRETLIMLTYSKNIVEVF* 
182ORF052 

4096 gtgatagttgacaagagtcaaatccggcgagattgggcgaatgtacacgtgaaacatcgtgcgctcccgctaagtcatggacac 

1 VIVDKSQIWRDWANVHVKYRALPLSYGH 

4180 ataaacgttttgaccgtcaaccaatcgcaaaaaccttttaggagtagcccttaa 4233 

29 INVLTVNQSQKPFRSSP* 
182ORF053 

15656 gtggaacagaatacgaagattttagtatcaacaatgagtttgtcaatgatagtgatacgtttattgaaaagagaagtaaaaata 

1 VEQNTKILVSTMSLSMIVIRLLKREVKI 

15740 gtagtttcttatgcgccatcgcctttgaagggaaaatctttgggcattggatag 15793 

29 VVSYAPLLLKGKSLGIG* 



182ORF054 

8136 gtgatacattgcttcgtttgctgtaaagaaaatagggttgtgataatgatcgatttgaccaccttccgcccctgcgcaaaccat 

1 VIHCFVCCKEMRVVIMIDLTIFCPCANH 

8052 gaacccaagcgtagggtctttaaagaaaccaagataaccattagtgtgtaa 8002 

29 EPKCRVFKETKITISV* 
182ORF055 

8324 atgaaaagaaatacttctcactgctacaagcctacaaccaaattgacgaaaataatcaggctgtttttgtggacaaagatatgg 

1 MKRMTSHCYKLITKLTKI IRLFLWIKIW 

8408 agtttgacgaatcetttaatgtatggcaaacaaatgctccatatgtag 8455 

29 SLTNLLMYGKQMLHM* 
182ORF056 

6549 gtggcccatctcccttttcctactatttacttcctatcaattcaagtggggaggtatacaaaccaaacggggcaggcaatgcta 

1 VAHLLFPI IY FLSIQVGRYTNQMGQAML 

6633 attttggagagtacatggcgtttcttacaacgaaagaaccttttttaa 6680 

29 ILBSTWRPLQRKNLF* 
182ORF057 

8264 atgtccgccatatctaaagcaaaacgatgcaaacttiggtaacgtaggaactttcaagtcaccactacacaacatgatacacttt 

1 MSAISKAKRCKLGNVGTFKSLLYMHIHF 

8180 gatttatcatcatcatcatcacacctcaaaacaggacatctcttgtga 8133 

29 DLSSSSSYLKTGYLL* 
182ORF058 

5176 gcgtattcaaattcgcttacttcgccacctgtgtataaagcgtccattacaccagcaacgaaaccattgaaattaccccatgaa 

1 VYSNSLTSSPVYKAFITPATKLLKLSHE 

5092 gtaaatgctttttctaaccatgcttcttggatcgtttgtttgtag 5048 

29 VNAFSNHASWIVCL* 
182ORF059 

15876 acggtccttcgtagtcattgcacaaaaatgatttgtacctggttgataaccacaactcacacagacacaacctgtttcagcgtc 

1 MVFRSKCIKMICIWLIIITHIDTTCFSV 

15792 tatccaatacccaaagattttccccccaaaagcaatggcgcacaa 15748 

29 YPIPKDPPFKSNGA* * 

182ORF060 

15404 gcgatttttgatttcccaaccaaaaactcatcaaacaaaattgcacgaacttcgggacacccactagatttttcaaccccccac 

1 VI FDFS I KNSSNKIVRTSGYSLDFSI PH 

15320 gtactaagtggaacagcccaacccaccaatttatcatcacaatag 15276 

29 VLSGTAQPINLSSQ* 

182ORF061 

2102 atgaggggacttctccacccgtcccagactcgatcacccccgcaatcttactgtaaacctgttcttttttctgttgractcctg 
1 MRGLLHLFQTRSLLQSYCXLVLFSVVLL 

2018 cttcgtcataaatgtagtcaaggttcatgtctaagaagttactaa 1974 

29 LRHKCSQGSCLRSY* 
182ORF062 

1992 acgnccaagaagccaccaacacacgtccccacaaacagatcaagccccactgagccaagtaaacgaacaataccgtcagcgtct 

1 MSKKLl>TYVFINRSSPIESSKRTISSAS 

1908 gaattgaaaacagtaaacatcgettgtctgaaattgtcgtaa 1867 

29 ELKIVNIACLKLS* 
182ORF063 

14306 gcgcacctcccaaacccctctcatgcgcaaaatgatacacaccaatctttttacctaaagacaaagcttgttgaaacgctcggt 

1 VYLLNPSKAQNDTHQSPYLKTKLVEMLG 

14222 cacaatcagggctcacataacctgtcccgcctgttgctttaa 14181 

29 HNQGLHHLFRLLL* 
182ORF064 

7356 atgatgctagtcaaaccaacaaaagggttgtcacttgctaaggctgaaaagatcgctcctcctgtactcatcgcaccgtctccc 

1 MMLVKPTKGLliLAKAEJCIAPPVLIALFP 

7272 ataccatgtctgaaagtattgcgaatgttttgctcttga 7234 

29 IPCLKVLRMFCS* 
182ORF065 

3582 atgaacgccacctgtatcacaataaataatgcgatcaaaacatttttgagcggttgcaacggtagtatatctaccccaagccgc 

1 MNAICITINNAIKTFLSGCNGSISTPSR 

3498 cacaaaactagcaagcggaacataaacaggatctcttaa 3460 

29 HKTSKRNINRIS* 
182ORF066 

4234 atgtggctactcttttttgtgtttcacagaattatgCCtcacgtgaaacagttttcatggtacaatagaatcaaaaggaggtgg 

1 MWLLFFVFHRIMFHVKQFLWYNRIKRRW 

4318 agattatggaaattaaagaacatgaatcaattttaa 4353 

29 RLWKLKNMNQF* 
182ORF067 

13882 atgatacctgcctcagcttcaaaactactaaagtcgatttccttgttaaatctggcaatcgtaaaacctaacacaaaatcgaca 

1 MIPALALKLLKSISLLNLAIVKPNTKSI 

13798 atcattgcaaccattaaccatataatcaaaccataa 13763 

29 IIATIHHI1XP* 

182ORF068 

7267 atgtctgaaagtaccgcgaatgttttgctcttgageaatcaaggagtttttgtctccttgcatgaatgcagaagcatagtcaga 

1 MSESIANVLLLSNQGVPVSLHSCRSIVR 

7183 tttaactcccacatcgctaggatcattaccgatcaa 7148 

29 FNSYIVRIIID* 
182ORF069 

5027 gtggaacaatgtttttacatcgggaacttcctgtttaaacacccctgtaacagactcgtcagggttgaacctatgttcctgtgc 

1 VEQCFYlGNFLFKYPCNRLVRVELMFliC 

4943 aatgtcaacaaaaatttcttcaaccgttcgacctaa 4908 

29 MVNKNFFNRST* 
182ORF070 

1031 gtgacggttcggccecaccaaaaccagaaacttegtctgagcaaactagaatatcttEcaattctagtacttcgecaattgcgt 

1 VMVRLHQNQKLRLSKLEYLS I LVLRQLR 

947 tacgtaacggaategaagccccgtttgtggcaetga 912 

29 YVTELKPRLWH* 
182ORF071 

11741 atggttctgcactatggttgccacaaggcgctcaaagtggtaaaggaattttctttaatgatactcgcaattacaatcgttttg 

1 MVLHYGCHKALKVVKEFSIiMI L A I T X V L 

11825 acttcgatttgtttgctcgtaactgtactttaa 11857 

29 TLICLFVTVI** 
182ORF072 

11723 atgtttacatcaaatgccgtcaccgttccaaactttaacgtcgtctctcccgaccctaagaaagtaactacaggtacatcacgt 

1 MFTLNAVIVSMFMVVSPDPKKVTTGTSR 

11639 ttcaattcaatggrgttagcaaagcgataa 11610 

29 PNSMVLAKR* 
182ORF073 

2876 gtgaagccgcetttgtatgctttacgeaagtctttatcaaaccccaaagacaaaacaggaaaccactgtttgaaagttgatttt 

1 VKPPLYALRKSLSNPKDKIGNHCLXVDF 

2792 ccatgtgtagcttttagccaatctttgtaa 2763 

29 PCVAPSQSL* 
182ORF074 

8923 gcgattgataaactttgtctcaaaccctgcccgttttgtctcgccataaaacggataaccaaaatcaaacaattgttttcggcc 

1 VIOKFCFKFCSFCFVIKRI I KI KQLFSA 

8839 aacttcaatacgttcttttegagacaa 8813 

29 NFNTFFSR* 

182ORF075 

7463 gtgttacattatccggaatatttccgatatctgccactttacctgccaagaggttcaaaccgttttctttttcagaaacatagt 

1 VLHYLEYFRYLPLYLPRGSNRFLFQKHS 

7379 tgcttacctgttgtcctgcccccacga 7353 

29 CLLVVLLP* 
182ORF076 

2426 acgagtgcggagaacgctcgaccttcctecgcttctttacaccacttgaaaccaccttcgaataaceacgaaagcataaaccct 
1 MSVENVRSSFASLKHLKPFLNNHESINS 

2342 ccgccaaatttttcgttgtggaaataa 2316 

29 PSNFSLWK* 
182ORF077 

11858 atgaaggaacgtatgccgttgtcgccagaggtagaggggctacatttgaaaactgcccattccccaatatccctcaagcaacta 
1 MKERMLLLLEVEGLHLKIVYSLISLKQL 
11942 tcaaaacagcttttcccgatgtaa 11965 
29 SKQLFPM* 
182ORF078 

7671 gcgcctacaatatttggttcttttaattcaatgaaattccatgcttttcttgtttgtaagtttggtgtagccactcgattgccc 
1 VPTIFGSFNLMKFHAFLVCKFGVATRLL 
7587 ctcgcgccacacactgagaagtaa 7564 

29 FVPYIEK* 
182ORF079 

7488 gtgaaagataagcctgatccaagccgcgccacactacctggaatattctcgatatctgccaccctacctgccaagaggttcaaa 
1 VKDKFDPSCVTLSGIFSISATLPAKRPK 

7404 ccgttttccttttcagaaacatag 7381 

29 PFSFSET* 
182ORF080 

4473 gcgcgctactcgccgatgccaaagccccagccgttgctccgtagtctcctcgcaatgcttcaagatgttctacaatctttgatc 
1 VCYLLMSKLQLLLRS LLAMLQDVLQSLI 

4389 ttgcttcaccgtctgtga 4372 

29 LLHRL* 
Table 24 



Sequence similarities between phage 182 and public databases 



Phage: 182 



Database: nr 



Query= sid|110156|lan|182ORF001 Phage 182 ORF|5966-7780|2 
(604 letters) 

gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9) >... 384 e-105 
gi|138123|sp|P04331|VG9_BPPH2 TAIL PROTEIN (LATE PROTEIN GP9) >... 374 e-103 
gi|1429238|gnl|PID|e1173412 (X99260) tail protein [Bacteriophag... 346 3e-94 
gi|215339 (M12456) p9 tail protein [Bacteriophage phi-29] >gi|2... 208 8e-53 
gi|1181970|gnl|PID|e221269 (Z47794) tail protein [Bacteriophage... 62 8e-09 
gi|1181968|gnl|PID|e221267 (Z47794) tail protein [Bacteriophage... 56 6e-07 
gi|2500030|sp|Q59968|CARA_SULSO CARBAMOYL-PHOSPHATE SYNTHASE SM... 49 8e-05 



Query= sid|110157|lan|182ORF002 Phage 182 ORF|2152-3873|1 
(573 letters) 

gi|118848|sp|P19894|DPOL_BPM2 DNA POLYMERASE >gi|76896|pir||J00... 665 0.0 
gi|1429230|gnl|PID|e1173404 (X99260) DNA polymerase [Bacterioph... 657 0.0 
gi|118849|sp|P03680|DPOL_BPPH2 DNA POLYMERASE (EARLY PROTEIN GP... 654 0.0 
gi|118851|sp|P06950|DPOL_BPPZA DNA POLYMERASE (EARLY PROTEIN GP... 654 0.0 
gi|15732 (X53371) DNA polymerase (AA 1-575) [Bacteriophage phi-29] 651 0.0 
gi|15734 (X53370) DNA polymerase (AA 1-575) [Bacteriophage phi-29] 651 0.0 
gi|1572479|gnl|PID|e242301 (X96987) DNA polymerase [Bacteriopha... 565 e-160 
gi|1072656|pir||S51275 DNA polymerase - phage CP-1 >gi|836593|g... 301 1e-80 
gi|118647|sp|P22374|DPOM_ASCIM PROBABLE DNA POLYMERASE >gi|8385... 71 3e-11 
gi|461962|sp|P33537|DPOM_NEUCR PROBABLE DNA POLYMERASE >gi|2833... 65 1e-09 
gi|461963|sp|P33538|DPOM_NEUIN PROBABLE DNA POLYMERASE >gi|1018... 62 1e-08 
gi|1084487|pir||S4X618 DNA polymerase - slime mold (Physarum po... 61 3e-08 
gi|2435429 (AF012250) unassigned reading frame (possible DNA po... 61 3e-08 
gi|578157|gnl|PID|e246743 (X52106) DNA polymerase [Neurospora i... 59 1e-07 
gi|2147969|pir||S72369 probable DNA-polymerase - Gelasinospora ... 58 2e-07 
gi|21479«e|pir||S62752 probable DNA-polymerase - Gelasinospora ... 58 2e-07 
gi|3511140 (AF061244) B type DNA polymerase [Agrocybe aegerita] 57 3e-07 
gi|118850|sp|P10479|DPOL_BPPRD DNA POLYMERASE (PROTEIN PI) >gi|... 56 6e-07 
gi|578144 (X63909) putative DNA-polymerase, B-type [Morchella c... 47 3e-04 
gi|232013|sp|P30322|DPON_AGABT PROBABLE DNA POLYMERASE >gi|3208... 46 6e-04 

Query= sid|110159|lan|182ORF004 Phage 182 ORF|4626-5954|3 
(442 letters) 

gi|138117|sp|P13849|VG8_BPPH2 MAJOR HEAD PROTEIN (LATE PROTEIN ... 402 e-111 
gi|138118|sp|P07531|VG8_BPPZA MAJOR HEAD PROTEIN (LATE PROTEIN ... 402 e-111 
gi|1429236|gnl|PID|e1173410 (X99260) major head protein [Bacter... 381 e-105 
gi|1181958|gnl|PID|e221257 (Z47794) major head protein [Bacteri... 159 2e-38 

Query= sid|110160|lan|182ORF005 Phage 182 ORF|12651-13700|3 
(349 letters) 

gi|137932|sp|P15132|VG13_BPPH2 MORPHOGENESIS PROTEIN 1 (LATE PR... 52 8e-06 
gi|1429242|gnl|PID|e1173416 (X99260) morphogenesis protein [Bac... 48 7e-05 
gi|137933|sp|P07538|VG13_BPPZA MORPHOGENESIS PROTEIN 1 (LATE PR... 47 2e-04 

Query= sid|110161|lan|182ORF006 Phage 182 ORF|14995-16026|1 
(343 letters) 

gi|137944|sp|P11014|VG16_BPPH2 ENCAPSIDATION PROTEIN (LATE PROT... 309 2e-83 
gi|137945|sp|P07541|VG16_BPPZA ENCAPSIDATION PROTEIN (LATE PROT... 305 3e-82 
gi|1429245|gnl|PID|e1173419 (X99260) encapsidation protein [Bac... 300 1e-80 
gi|1181972|gnl|PID|e221271 (Z47794) encapsidation protein [Bact... 152 6e-36 



Query= sid|110162|lan|182ORF007 Phage 182 ORF|7795-8775|1 
(326 letters) 

gi|1429239|gnl|PID|e1173413 (X99260) upper collar protein [Bact... 271 5e-72 

gi|137915|sp|P07535|VG10_BPPZA UPPER COLLAR PROTEIN (CONNECTOR ... 256 1e-67 

gi|137914|sp|P04332|VG10_BPPH2 UPPER COLLAR PROTEIN (CONNECTOR ... 256 2e-67 

gi|1181960|gnl|PID|e221259 (Z47794) connector protein [Bacterio... 148 6e-35 

Query= sid|110163|lan|182ORF008 Phage 182 ORF|14105-14983|2 
(292 letters) 

gi|4210750|gnl|PID|e1374037 (AJ132604) LysL protein [Lactococcu... 139 2e-32 

gi|462559|sp|P34020|LYC_CLOAB AUTOLYTIC LYSOZYME (1,4-BETA-N-AC... 75 8e-13 

gi|2327014 (U82823) putative lysozyme [Saccharopolyspora erythr... 64 2e-09 

gi|126652|sp|P25310|LYCM_STRGL LYSOZYME M1 PRECURSOR (1,4-BETA-... 60 2e-08 

gi|127789|sp|P19386|LYCA_BPCP9 LYSOZYME (ENDOLYSIN) (MURAMIDASE... 60 2e-08 

gi|67761|pir||MUBPCP N-acetylmuramoyl-L-alanine amidase (EC 3.5... 59 3e-08 

gi|4105636 (AF049087) lys [Leuconostoc oenos bacteriophage 10MC] 59 3e-08 

gi|623084 (L02495) muramidase; muramidase [Bacteriophage LL-H] 57 1e-07 

gi|127787|sp|P15057|LYCA_BPCP1 LYSOZYME (ENDOLYSIN) (MURAMIDASE... 57 2e-07 

gi|126597|sp|P00721|LYCH_CHASP N,O-DIACETYLMURAMIDASE (LYSOZYME... 57 2e-07 

gi|127788|sp|P19385|LYCA_BPCP7 LYSOZYME (ENDOLYSIN) (MURAMIDASE... 57 2e-07 

gi|67762|pir||MUBPC7 N-acetylmuramoyl-L-alanine amidase (EC 3.5... 56 3e-07 

gi|3025168|sp|P76421|YEGX_ECOLI HYPOTHETICAL 32.0 KD PROTEIN IN... 53 2e-06 

gi|4204413 (AF047001) Lys44 [Oenococcus oeni temperate bacterio... 53 3e-06 

gi|2116978|gnl|PID|d1020940 (D88151) cortical fragment-lytic en... 52 5e-06 

gi|2392844 (AF011378) lysin [Bacteriophage sk1] 48 8e-05 

Query= sid|110164|lan|182ORF009 Phage 182 ORF|8765-9601|2
(278 letters)

gi|1429240|gnl|PID|e1173414 (X99260) lower collar protein [Bact...  180  1e-44
gi|137921|sp|P04333|VG11_BPPH2 LOWER COLLAR PROTEIN (LATE PROTE...  171  5e-42
gi|215341 (M12456) p11 lower collar protein [Bacteriophage phi-29]  98  9e-20
gi|224162|prf||1011232B protein p11, lower collar [Bacteriophage...  97  1e-19
gi|535260 (Z30339) STARP antigen [Plasmodium reichenowi]  50  1e-05
gi|4049753 (AF063866) ORF MSV230 hypothetical protein [Melanopl...  49  4e-05
gi|2131557|pir||S70306 hypothetical protein YEL077C - yeast (Sa...  48  5e-05
gi|131782|sp|P12753|RA50_YEAST DNA REPAIR PROTEIN RAD50 (153 KD...  48  7e-05
gi|2131309|pir||S70305 hypothetical protein YBL113C - yeast (Sa...  47  2e-04
gi|499325 (Z26314) STARP antigen [Plasmodium falciparum]  46  3e-04
gi|3845171 (AE001391) ribosome releasing factor (00, TP) [Plasm...  46  3e-04
gi|731903|sp|P40434|YIR7_YEAST HYPOTHETICAL 197.5 KD PROTEIN IN...  45  5e-04
gi|1632829|gnl|PID|e276379 (Y08924) AARP2 protein [Plasmodium f...  45  5e-04
gi|1176490|sp|P40889|YJW5_YEAST HYPOTHETICAL 197.6 KD PROTEIN I...  45  5e-04
gi|1077300|pir||S51848 hypothetical protein HRD1054 - yeast (Sa...  45  5e-04
gi|2425143 (AF020407) wimA [Dictyostelium discoideum]  45  6e-04
gi|1181961|gnl|PID|e221260 (Z47794) collar protein [Bacteriopha...  45  6e-04
gi|2132657|pir||S64819 probable membrane protein YLL067c - yeas...  45  8e-04
gi|2133041|pir||S65341 probable membrane protein YPR204w - yeas...  45  8e-04
gi|730275|sp|P39793|PBPA_BACSU PENICILLIN-BINDING PROTEINS 1A/1...  45  8e-04

Query= sid|110165|lan|182ORF010 Phage 182 ORF|1310-2155|2
(281 letters)

gi|135604|sp|P06812|TERM_BPNF DNA TERMINAL PROTEIN >gi|75815|pi...  69  3e-11
gi|1572478|gnl|PID|e242334 (X96987) terminal protein [Bacteriop...  65  3e-10
gi|1429231|gnl|PID|e1173405 (X99260) terminal protein [Bacterio...  64  1e-09

Query= sid|110166|lan|182ORF011 Phage 182 ORF|9607-10158|1
(183 letters)

gi|137928|sp|P07537|VG12_BPPZA PRE-NECK APPENDAGE PROTEIN (LATE...  51  6e-06
gi|1429241|gnl|PID|e1173415 (X99260) pre-neck appendage protein...  51  6e-06
gi|137927|sp|P20345|VG12_BPPH2 PRE-NECK APPENDAGE PROTEIN (LATE...  50  1e-05

Query= sid|110169|lan|182ORF014 Phage 182 ORF|13716-14108|3
(130 letters)

gi|137936|sp|P11188|VG14_BPPH2 LYSIS PROTEIN (LATE PROTEIN GP14...  97  5e-20
gi|137938|sp|P07539|VG14_BPPZA LYSIS PROTEIN (LATE PROTEIN GP14...  96  8e-20
gi|1429243|gnl|PID|e1173417 (X99260) lysis protein [Bacteriopha...  96  8e-20
gi|215332 (M14782) lysis protein [Bacteriophage phi-29]  94  5e-19

Query= sid|110170|lan|182ORF015 Phage 182 ORF|854-1225|2
(123 letters)

gi|15670 (V01155) reading frame 10 (may be gene 4) [Bacteriopha...  70  5e-12
gi|138072|sp|P06953|VG5A_BPPZA EARLY PROTEIN GP5A >gi|75836|pir...  69  7e-12

Query= sid|110174|lan|182ORF019 Phage 182 ORF|4323-4613|3
(96 letters)

gi|1429235|gnl|PID|e1173409 (X99260) head morphogenesis protein...  61  2e-09
gi|138111|sp|P13848|VG7_BPPH2 HEAD MORPHOGENESIS PROTEIN (LATE ...  57  3e-08
gi|138112|sp|P07533|VG7_BPPZA HEAD MORPHOGENESIS PROTEIN (LATE ...  54  1e-07

Query= sid|110180|lan|182ORF025 Phage 182 ORF|548-814|2
(88 letters)

gi|138099|sp|P06955|VG6_BPPZA EARLY PROTEIN GP6 >gi|75841|pir||...  55  7e-08
gi|138098|sp|P03685|VG6_BPPH2 EARLY PROTEIN GP6 >gi|75840|pir||...  54  2e-07
gi|1429234|gnl|PID|e1173408 (X99260) gene 6 product [Bacterioph...  54  2e-07



Table 25 

Homologies between 182 ORFs and proteins in public databases 



Phage: 182 
Database : Swissprot 

Query= sid|110156|lan|182ORF001 Phage 182 ORF|5966-7780|2
(604 letters)

gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9)  384  e-106
gi|138123|sp|P04331|VG9_BPPH2 TAIL PROTEIN (LATE PROTEIN GP9)  374  e-103
gi|2500030|sp|Q59968|CARA_SULSO CARBAMOYL-PHOSPHATE SYNTHASE SM...  49  2e-05

Query= sid|110157|lan|182ORF002 Phage 182 ORF|2152-3873|1
(573 letters)

gi|118848|sp|P19894|DPOL_BPM2 DNA POLYMERASE  665  0.0
gi|118849|sp|P03680|DPOL_BPPH2 DNA POLYMERASE (EARLY PROTEIN GP2)  654  0.0
gi|118851|sp|P06950|DPOL_BPPZA DNA POLYMERASE (EARLY PROTEIN GP2)  654  0.0
gi|118847|sp|P22374|DPOM_ASCIM PROBABLE DNA POLYMERASE  71  7e-12
gi|461962|sp|P33537|DPOM_NEUCR PROBABLE DNA POLYMERASE  65  3e-10
gi|461963|sp|P33538|DPOM_NEUIN PROBABLE DNA POLYMERASE  62  3e-09
gi|118850|sp|P10479|DPOL_BPPRD DNA POLYMERASE (PROTEIN P1)  56  2e-07
gi|232013|sp|P30322|DPOM_AGABT PROBABLE DNA POLYMERASE  46  2e-04
gi|118887|sp|P10582|DPOM_MAIZE DNA POLYMERASE (S-1 DNA ORF 3)  46  2e-04

Query= sid|110159|lan|182ORF004 Phage 182 ORF|4626-5954|3
(442 letters)

gi|138117|sp|P13849|VG8_BPPH2 MAJOR HEAD PROTEIN (LATE PROTEIN ...  309  6e-84
gi|138118|sp|P07531|VG8_BPPZA MAJOR HEAD PROTEIN (LATE PROTEIN ...  305  7e-83

Query= sid|110160|lan|182ORF005 Phage 182 ORF|12651-13700|3
(349 letters)

gi|137932|sp|P15132|VG13_BPPH2 MORPHOGENESIS PROTEIN 1 (LATE PR...  52  2e-06
gi|137933|sp|P07538|VG13_BPPZA MORPHOGENESIS PROTEIN 1 (LATE PR...  47  6e-05

Query= sid|110161|lan|182ORF006 Phage 182 ORF|14995-16026|1
(343 letters)

gi|137945|sp|P07541|VG16_BPPZA ENCAPSIDATION PROTEIN (LATE PROT...  402  e-112
gi|137944|sp|P11014|VG16_BPPH2 ENCAPSIDATION PROTEIN (LATE PROT...  402  e-112

Query= sid|110162|lan|182ORF007 Phage 182 ORF|7795-8775|1
(326 letters)

gi|137915|sp|P07535|VG10_BPPZA UPPER COLLAR PROTEIN (CONNECTOR ...  256  3e-68
gi|137914|sp|P04332|VG10_BPPH2 UPPER COLLAR PROTEIN (CONNECTOR ...  256  5e-68

Query= sid|110163|lan|182ORF008 Phage 182 ORF|14105-14983|2
(292 letters)

gi|462559|sp|P34020|LYC_CLOAB AUTOLYTIC LYSOZYME (1,4-BETA-N-AC...  75  2e-13
gi|126652|sp|P25310|LYCM_STRGL LYSOZYME M1 PRECURSOR (1,4-BETA-...  60  5e-09
gi|127789|sp|P19386|LYCA_BPCP9 LYSOZYME (ENDOLYSIN) (MURAMIDASE...  60  5e-09
gi|127787|sp|P15057|LYCA_BPCP1 LYSOZYME (ENDOLYSIN) (MURAMIDASE...  57  4e-08
gi|126597|sp|P00721|LYCH_CHASP N,O-DIACETYLMURAMIDASE (LYSOZYME...  57  4e-08
gi|127788|sp|P19385|LYCA_BPCP7 LYSOZYME (ENDOLYSIN) (MURAMIDASE...  57  5e-08
gi|3025168|sp|P76421|YEGX_ECOLI HYPOTHETICAL 32.0 KD PROTEIN IN...  53  5e-07

Query= sid|110164|lan|182ORF009 Phage 182 ORF|8765-9601|2
(278 letters)

gi|137921|sp|P04333|VG11_BPPH2 LOWER COLLAR PROTEIN (LATE PROTE...  171  1e-42
gi|131782|sp|P12753|RA50_YEAST DNA REPAIR PROTEIN RAD50 (153 KD...  48  2e-05
gi|1176490|sp|P40889|YJW5_YEAST HYPOTHETICAL 197.6 KD PROTEIN I...  45  1e-04
gi|731903|sp|P40434|YIR7_YEAST HYPOTHETICAL 197.5 KD PROTEIN IN...  45  1e-04
gi|730275|sp|P39793|PBPA_BACSU PENICILLIN-BINDING PROTEINS 1A/1...  45  2e-04
gi|1168610|sp|P41696|AZF1_YEAST ASPARAGINE-RICH ZINC FINGER PRO...  44  3e-04
gi|731587|sp|P38900|YH19_YEAST HYPOTHETICAL 70.1 KD PROTEIN IN ...  44  3e-04

Query= sid|110165|lan|182ORF010 Phage 182 ORF|1310-2155|2
(281 letters)

gi|135604|sp|P06812|TERM_BPNF DNA TERMINAL PROTEIN  69  8e-12

Query= sid|110166|lan|182ORF011 Phage 182 ORF|9607-10158|1
(183 letters)

gi|137928|sp|P07537|VG12_BPPZA PRE-NECK APPENDAGE PROTEIN (LATE...  51  2e-06
gi|137927|sp|P20345|VG12_BPPH2 PRE-NECK APPENDAGE PROTEIN (LATE...  50  3e-06

Query= sid|110169|lan|182ORF014 Phage 182 ORF|13716-14108|3
(130 letters)

gi|137936|sp|P11188|VG14_BPPH2 LYSIS PROTEIN (LATE PROTEIN GP14)  97  2e-20
gi|137938|sp|P07539|VG14_BPPZA LYSIS PROTEIN (LATE PROTEIN GP14)  96  2e-20

Query= sid|110170|lan|182ORF015 Phage 182 ORF|854-1225|2
(123 letters)

gi|138072|sp|P06953|VG5A_BPPZA EARLY PROTEIN GP5A  69  2e-12

Query= sid|110174|lan|182ORF019 Phage 182 ORF|4323-4613|3
(96 letters)

gi|138111|sp|P13848|VG7_BPPH2 HEAD MORPHOGENESIS PROTEIN (LATE ...  57  9e-09
gi|138112|sp|P07533|VG7_BPPZA HEAD MORPHOGENESIS PROTEIN (LATE ...  54  4e-08

Query= sid|110180|lan|182ORF025 Phage 182 ORF|548-814|2
(88 letters)

gi|138099|sp|P06955|VG6_BPPZA EARLY PROTEIN GP6  55  2e-08
gi|138098|sp|P03685|VG6_BPPH2 EARLY PROTEIN GP6  54  5e-08
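
A hit summary line in Table 25, once it is in the one-line form "identifier, description, score, E value", can be read mechanically. The following is a minimal sketch, not part of the original analysis; the names `Hit`, `parse_e_value` and `parse_hit` are hypothetical, and the only convention assumed is that an E value printed without a mantissa (for example `e-106`) stands for `1e-106`.

```python
import re
from typing import NamedTuple


class Hit(NamedTuple):
    identifier: str    # e.g. "gi|138124|sp|P07534|VG9_BPPZA"
    description: str   # free-text description, possibly truncated with "..."
    score_bits: float  # score in bits
    e_value: float     # expectation value


# identifier, description, integer score, then an E value at the end of the line
_HIT_RE = re.compile(
    r"^(gi\|\S+)\s+(.*?)\s+(\d+)\s+(\d*\.?\d*e-\d+|\d+\.\d+)\s*$"
)


def parse_e_value(text: str) -> float:
    """Convert a printed E value; 'e-106' is read as 1e-106 (assumed shorthand)."""
    return float("1" + text) if text.startswith("e") else float(text)


def parse_hit(line: str) -> Hit:
    """Split one hit summary line into its four fields."""
    match = _HIT_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a hit summary line: {line!r}")
    identifier, description, score, e_value = match.groups()
    return Hit(identifier, description, float(score), parse_e_value(e_value))
```

For example, `parse_hit("gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9)  384  e-106")` yields a score of 384 bits and an E value of 1e-106.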
BLASTP 2.0.8 [Jan-05-1999]

Query= sid|110156|lan|182ORF001 Phage 182 ORF|5966-7780|2
(604 letters)

>gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9)
>gi|75849|pir||WMBP9Z gene 9 protein - phage PZA
>gi|216058 (M11813) tail protein [Bacteriophage PZA]
Length = 599

Score = 384 bits (975), Expect = e-105

Identities = 231/610 (37%), Positives = 344/610 (55%), Gaps = 36/610 (5%)

Query: 6 TTJVKLU^NVPPWirniTRWFKT^ 65 

TNV++LA+VPF N V +TRWF + Q ++FNS + E ++Q ♦ V 
Sbjct: 9 T^^7KILADVP?SND7KNTRHFTSSSNQYNWFNSKTRVyEHSK^^'FQGFRENK5YlSVSLR 68 

Query. 66 KDAIiYACK^IFKNEETYPSKWQYAFVTOIEinCNDNTSFVrFEIDVIjQr^FDIGIRESF 125 

D LY +Y++F+N + Y +KW YAFVT++EYKN T++V FEIDVLQT+ F+I +ESF 
Sbjct: 69 LDLLYNAS YI MFQNAD - YGNKWFYAFVTELEYKNVGTTYVHFE I DVLQTWHFN I KFQESP 127 

Query: 126 IAKEHPQLYYSNGIPFINTIEESLDYGREYTTTNVTTFHPNDGVNFLVXLTSEAM- - PVG 183 

I +EH +L+ +G P INTI+E L+YG BY +V P O + FLV+++ M G 
Sbjct: 128 IVREHVKLWITOIXJTPTIOTIDEGI^GSEYDIVSVENHRPYDDMMFLWISKSIKHGTAG 187 

Query: 184 DKEDKSG - - -GSI VGGPS PFSYYLLP INSSGEVYKPN - GAGNANFGEYMAFLT TKEP 236 

+ E ♦ S+ G P P YY* P G+V K G NAN + LT ++ + 

Sbjct: 188 EAE5RX^It^I^GMPQPUnrfIHPFYKDGKVPKTFIGD^ANI^PIV>^TNIPSQKS 247 

Query: 237 FLNXIVGMYVTSYTGI FFIVDBWnCTVRYNAGGSYKJMLFTYASDPTGTMKTFAFFCVKE 296 

+N IV MYVT YG+ ++.ADG + T VK+ 
Sbjct: 248 AVNNIVNMYVTDYIGLXLOYXNGDKBLKIiDKDMFEQAGI ADDKHGNVDTIF VKK 301 

Query: 297 ARTFVPKKEDLVGNVYNYFREAFFFNVKESKLFM^YCLIEITDTKGHVM^ 356 

♦ ID G+ + F + +ESKL MYPYC+ B+TD KG+ M L+ BY+ 

Sbjct: 302 IPDYETLEID-TGOKHGGFTKD QESIO«HYPYCVTBVTDFKGNHMNLKTEYIDNN 355 

Query: 357 KLSVYVKGSLGISNKVMIBPIDYDVSNSTI ITNLSDXMLIDNDPNDVGVXSDYASA 412 

KL ♦ V+GSLG+SNKV DY+ S +T D LI+N+PND+ + +DY SA 

Sbjct: 356 KLKIQVRGSUT^SNKVAYS I QDYN AGGSLSGGDRLTASLDTSLINKNPNOI Al INDYLSA 415 

Query: 413 FMQGNKNSLIAQEQNIP^FRliGKC^SAMSTGGAIFSAIJ^NNPFV^^ 472 

++QGNKNSL Q+ +1 6M tS G ♦+ +PP ++♦ G N 
Sbjct: 416 YLQGNKNSLENQKSSILFNGIVGMLGGGVSAG ASAVGRS PFGLASS VTGMTSTAGN 471 

Query: 4 73 YVSEKENGLNLLAGKVADIENIPDNVTQLGSNLSFTTGN- FQNYYQLRFKQI KYEYATRL 531 

V + + L K ADI NIP +T++G N +F GN ++ Y ♦+ KQ+K BY L 
Sbjct: 472 AVLD MQALQAKQAD I AN I P PQLTKMGGNTAFD YGNGYRG VYVI K - KQLKAE YRRSL 526 

Query: 532 DRYFSMYGTKSNRVATPNI^RKAWNPIKLKEPNIVGTMSNDVLTRVKQIFSAGVTLWHT 591 

+F YG K NRV FNL+TREA+N+I + K+ I G + +N + L ♦+ IF G+TLWHT 
Sbjct: 527 SSFFHKYGYKITOVKJ<PNLRTRXAYNYIQTKDCFISGDINNNDLQEIRTIFDNGITLWHT 586 

Query: 592 NDVLNYNQDN 601 

*D* NY+ +N 
Sbjct: 587 DOIGNYSVBN 596 
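
The statistics line of each alignment can serve as an internal check on the printed numbers. The sketch below is illustrative only: the function names are hypothetical, and the rounding rule is an assumption inferred from the figures printed in these listings (identity and gap percentages truncated to integers, and the positives percentage equal to the identity percentage plus the truncated percentage of the non-identical positives), not a documented property of the program.

```python
import re

# Tolerates up to four non-digit characters between the label and the fraction,
# so " = " as well as stray separator characters are accepted.
_FIELD_RE = re.compile(
    r"(Identities|Positives|Gaps)\D{0,4}(\d+)/(\d+)\s*\((\d+)%\)"
)


def parse_stats(line):
    """Return {label: (count, alignment_length, printed_percent)}."""
    return {
        label: (int(count), int(length), int(percent))
        for label, count, length, percent in _FIELD_RE.findall(line)
    }


def expected_percents(stats):
    """Recompute the percentages under the truncation rule assumed above."""
    identical, length, _ = stats["Identities"]
    positives = stats["Positives"][0]
    pct_identical = 100 * identical // length
    expected = {
        "Identities": pct_identical,
        "Positives": pct_identical + 100 * (positives - identical) // length,
    }
    if "Gaps" in stats:
        gaps, gap_length, _ = stats["Gaps"]
        expected["Gaps"] = 100 * gaps // gap_length
    return expected


def is_consistent(line):
    """True when every printed percentage matches its recomputed value."""
    stats = parse_stats(line)
    expected = expected_percents(stats)
    return all(stats[label][2] == value for label, value in expected.items())
```

Applied to the line "Identities = 231/610 (37%), Positives = 344/610 (55%), Gaps = 36/610 (5%)", the check passes; a mis-read digit in a count or alignment length will usually make it fail.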



Query= sid|110157|lan|182ORF002 Phage 182 ORF|2152-3873|1
(573 letters)

>gi|118848|sp|P19894|DPOL_BPM2 DNA POLYMERASE >gi|76896|pir||JQ0161
DNA-directed DNA polymerase (EC 2.7.7.7) - phage M2
>gi|215509 (M33144) DNA polymerase [Bacteriophage M2]
Length = 572

Score = 665 bits (1697), Expect = 0.0

Identities = 327/589 (55%), Positives = 420/589 (70%), Gaps = 38/589 (6%)



Query: 3 KKYTGDPETTTOLNDCRWSWGVCDIDNVDtWTFGLEID 62 
K ♦+ OFETTT L+DCRVW++G +1 N+DN G +0 F ♦W M+ D+YFHN KP 
Sbjct: 4 KMFSCDFETTTKLDDCRVWAYGYMEIGNLDNYKIGNSLDBFMQWV-MEIQADLYFHNLKF 62

Query: 63 DGEFMLSWLFKNGFKWCKEAKEDRTFSTLISbWGQWYALEICWEVNVXXXXXXXXXXXXX 122 

1X3 F+++WL ++GFKW E <• 7++T+IS MGQWY ++IC+ 
Sbjct: 63 DG A F I VNWLEQHG FKWSNEGL PN - T YNT IIS KMGQWYM I D I CFGYK GKRKL 112 

Query: 123 XXIIYDSLKKYPFPVKQIAEAFNFPIKKGEIDYTKERFIGYKPTKDEWEYLKNDIQIMAM 182 

+IYDSLKK PFPVK+IA+ F P+ KG+IDY ERP+G++ T +E+EY+KNDI+I+A 
Sbjct: 113 HTVIYDSLKKLPFPVKKIAKDFQLPLLKGDIDYHTERPVGHE1TPEEYEYIKNDIEIIAR 172 

Query: 183 ALKIQFDQGLTRMTRGSDALGDYKDWLKATHGKSTFKQWFPILSLGFDKDLRKAYKGGFT 242 

AL IQF QGL RMT GSD+L +KD L F + FP LSL DK++RKAY+GGFT 

Sbjct: 173 ALDIQFKQGLDRMTAGSDSLKGFKDILST KKFNKVFPKIiSLPMDKEIRKAYRGGFT 228 

Query: 243 WVNKVFQGKEIGDGIVFDVNSLYPSQMYVRPLPYGTPLFYEGEYKPNNDYPLYIQNIKVR 302 

W+N ♦+ KEIG+G+VFDVNSLYPSQMY RPLPYG P+ ++G+Y+ + YPLYIQ 1+ 
Sbjct; 229 WLNDKYKEKEIGEGMVFDVHSLYPSQMYSRPLPYGAPIVFQGKYEXDEQYPLYIQRIRFE 288 

Query: 303 FRLKEGYIPTIQVKQSSLFIQNEYLESSVNKLGVDELIDLTLTKVDLELFFEHYDILEIH 362 

F LKEGYIPTIQ+K++ F NEYL++S GV B ++L LTNVDLEL EHY+-+ + 
Sbjct: 289 FELKEGYI PTIQIKKNPFFKGNEYLKNS GV-EPVELYLTHVDLELIQEHYELYNVE 343 

Query: 363 YTYGYMFKASCDMF1CGWIDICWIEVKNTTEGARKANAKGMLNSLYGKFGTNPDITGKVPYM 422 

Y G+ F+ +FK +IDKW VK EGA+K AK MLNSLYGKF +NPD+TGKVPY+ 
Sbjct: 344 YIMFKFRBKTGI^KDFIDKWTYVKTHEEGAKKQIJaa^ 403 

Query: 423 GEDG I VRLTLGEEELRDPVYVPLASFVTAWGRYTTITTAQKCFDRI I YCDTDS IHLVGTE 482 

♦DG + +G+EE +DPVY P+ F+TAW R+TTIT AQ C+DRI I YCDTDS IHL GTE 
Sbjct: 4 04 KDDGS LG FRVGDEE YKD P VYT PMGVF ITAWARFTT ITAAQACYDR 1 1 YCDTDS I HLTGTE 463 

Query: 4 83 VPEAIDHLVDPKKLGYWGHBSTFQRAKFIRQKT YVEEIDGEL 524 

VPB I ♦VDPKKLGYW HBSTF+RAK++RQKT YV+E+DG+L 
Sbjct: 464 VPEIIKDIVDPKXI^YWAHESTFKRAKYUIQKTYIQDIYVICEVDGKIJCECSPDEATTTXF S23 

Query: 525 NVKCAGMPDRIKEIVTFDNFBVGFSSYGKLliPKRTQGGVVLVDTMFTIK 573 

+VKCAGM D IK+ VTFDNP VGFSS GK F ♦ GGWLVD++FTIK 
Sbjct: 524 SVKCAGKTDTIKIOCVTFDNFAVGFSSMGKPKPVQVNGGVVLVDSVFTIK 572 
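
The "Query=" header lines carry enough information to cross-check the ORF coordinates against the stated protein length. The sketch below is illustrative only and works on the cleaned header form `sid|…|lan|… Phage … ORF|start-end|frame`. The function names are hypothetical, and two points are assumptions drawn from the values in these listings rather than stated in the text: that the coordinate range includes a terminal stop codon, and that the final field is the reading frame of the start coordinate.

```python
import re

_QUERY_RE = re.compile(
    r"sid\|(\d+)\|lan\|(\w+)\s+(.+?)\s+ORF\|(\d+)-(\d+)\|(\d)"
)


def parse_query(header):
    """Extract identifier, ORF name, phage, coordinates and frame from a header."""
    match = _QUERY_RE.search(header)
    if match is None:
        raise ValueError(f"not a query header: {header!r}")
    sid, orf, phage, start, end, frame = match.groups()
    return {
        "sid": int(sid),
        "orf": orf,
        "phage": phage,
        "start": int(start),
        "end": int(end),
        "frame": int(frame),
    }


def implied_letters(query):
    """Protein length implied by the coordinates, assuming a trailing stop codon."""
    nucleotides = query["end"] - query["start"] + 1
    if nucleotides % 3 != 0:
        raise ValueError("coordinate span is not a whole number of codons")
    return nucleotides // 3 - 1


def implied_frame(query):
    """Reading frame (1 to 3) implied by the start coordinate (assumed convention)."""
    return (query["start"] - 1) % 3 + 1
```

For the header "Query= sid|110157|lan|182ORF002 Phage 182 ORF|2152-3873|1", the coordinates span 1722 nucleotides, which implies 573 letters and frame 1, matching the "(573 letters)" line that follows the header.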



Query= sid|110159|lan|182ORF004 Phage 182 ORF|4626-5954|3
(442 letters)

>gi|138117|sp|P13849|VG8_BPPH2 MAJOR HEAD PROTEIN (LATE PROTEIN GP8)
>gi|75845|pir||WMBP89 gene 8 protein - phage phi-29
>gi|215325 (M14782) major head protein [Bacteriophage
phi-29] >gi|225362|prf||1301270B gene 8 [Bacillus sp.]
Length = 448

Score = 309 bits (783), Expect = 2e-83

Identities = 176/440 (40%), Positives = 250/440 (56%), Gaps = 27/440 (6%)

Query: 4 KI TEQDVLRATNVET P VQLMTAI YNS S SSLFQANVPMPNADN I EAVGAG I TRLD WKNE F 63 

+IT DV + +• ++ AI NS F++ VP* A+N+ VGAGI V+N+F 

SbjCt: 2 RITFNDVKTSLGITESTOIVNAIRNSQ^NFKSYVPIATANNVABVGAGILINQTVQNDF 61 

Query: 64 ISTLVDRIGKWIRYKSW^PIJCMFKXGNMPLGRTIEEIFVDIAQEHKFNPDESVTGVFK 123 

I++LVDRIG WIR S NPLK FKKG +PLGRTIEBI+ DI +E ++♦ +E+ VF+ 
SbjCt: 62 ITSLVDRIGLWIRQVSLNNPLKXFKKGQIPLGRTIEEIYTDITKEKQYDAEEAEQKVFE 121 

Query: 124 QEVPDVTCIXPHEINRXGYYKQTIQEAWLEKAFTSTONFN^ 183 

♦B+P+VKTLFHB NR+G+Y QTIQ+ L+ AF SW NF SFV+ ++NA+Y EV E+BY 
Sbjct: 122 REMPNVTCTLPIIBRNRQ/SFYHQTIQDDSIJCTAFVSWGNFESFVSSIINAIYNSABVTJEYBY 181 

Query: 184 TKLLIANYQEKELFKEI EIGEITBSNA- - KEFIRKIKSTSNKLEFM - - SSAYNAQGVKTS 239 

KLL+ NY K LP ++I B T S EF++K+++T+ KL S +N+ V+T 

Sbjct: 182 MXlXVDNYYSiCGLFTTVTCIDEPTSSTGALTBPVKXMRATARKLTL 241 

Query: 240 TSK^DQYXXXXXXXXXXXXXXXXXXXFNMSKTDFVGHKIVIDEFPKKEGEESSNIVAVIV 299 

+ D + FNM++TDF+G+ VID F S+ ♦ AV+V 

SbjCt: 242 SYMEDLHLI I D ADLEAE LDVD VLAKAFNMNRTDFLGNVTVI DG F ASTGLEAVLV 295 

Query: 300 DSEWFMIYDKLYKTTSLYNPBGLYWNYWLHHHQLYSTSQFGNAvTVFVKSATKPVTKV 359 

D +WFM+YD L+K ++ MP GLYWNY* H Q S S+F NAVAFV VT+V ♦ 

Sbjct: 296 DKDWFMVYDNIJKMETV^PRGLYWNYYYHWQTI^VSRFAKAVAFVSGDVPAVTQVIVS 3SS 

Query: 360 SATTSVVKGSSKDIALTFTPVEATNC^SWSSAPALV^^ 419 
♦V +G + V ATN ♦ V V G +T * G
Sbjct: 356 PNIAAVKQGGQQQFT- - - AYVRATNAKDHKV VWSVEGGSTGTAI TG 398 

Query: 420 QS LVTFTA I GGQQATVLVTV 43 9 

L++ + Q TV TV 

Sbjct: 3 99 DGLLSVSGNEDNQLTVKATV 418 



Query= sid|110160|lan|182ORF005 Phage 182 ORF|12651-13700|3
(349 letters)

>gi|137932|sp|P15132|VG13_BPPH2 MORPHOGENESIS PROTEIN 1 (LATE
PROTEIN GP13) >gi|75858|pir||WMBP23 gene 13 protein -
phage phi-29 >gi|215331 (M14782) morphogenesis protein
[Bacteriophage phi-29] >gi|225368|prf||1301270H gene 13
[Bacteriophage phi-29]
Length = 365

Score = 51.5 bits (121), Expect = 8e-06

Identities = 44/166 (26%), Positives = 70/166 (41%), Gaps = 14/166 (8%)

Query: 6 NEQIARGQTIAKILSKYGYNKNSQVGVVANLHWBSA GLNPNSNEXXXXXXXXX-QWT 61 

+ E Q I LS G+ K + G++ U+ ES GL N +E QWT 

Sbjct: 12 SEMKVNAQYILNYLSSNGIOTXQAICGKLGNMQSESTINPGLWQNL^ 71 

Query: 62 PKSNLYRQAQICGLSNAKAETLEGQAEI IAQGDICTGQWMDNTPVSSAGYTNPQTLSAFKQ 121 

P SN A GL ++ II + + QW++ ++ Y K 
Sbjct: 72 PASNYINWANSQGLPYKDMDS- -ELIGJIIWEVNNNAQWINLRDMTPKEY IKS 121 

Query: 122 SANIOVATINFMCHWERPGKLHIEERLDLAQAYSKHIOGSGGGGVK 167 

+ ♦ F+ +ERP + ER D A+ ♦ K++ G GGGG++ 

Sbjct: 122 TXTPRELAMIFLASYERPANPNQPERGDQABTWYKNLSGGGGGGLQ 167 



Query= sid|110161|lan|182ORF006 Phage 182 ORF|14995-16026|1
(343 letters)

>gi|137945|sp|P07541|VG16_BPPZA ENCAPSIDATION PROTEIN (LATE PROTEIN
GP16) >gi|75861|pir||WMBP16 gene 16 protein - phage PZA
>gi|216065 (M11813) morphogenesis protein C
[Bacteriophage PZA]
Length = 332

Score = 402 bits (1023), Expect = e-111

Identities = 186/332 (56%), Positives = 244/332 (73%), Gaps = 2/332 (0%)

Query: 11 E KNLYYN PNNALG FNC LMLFV I GARG I GKTYGY KKFWNR F I KHGEQ FI YLRRFKTE LKK 70 

♦K+L+YNF L ** ++ FVIGARGXGK+Y K + +NRFXK+GEQFXY+RR+K EL K 
Sbjct: 2 DKSLFYNPQKMLSYDRILNFVIGARGIGKSYAMKVY'PINRFIKYGEQFIYVRRYKPELAK 61 

Query: 71 IPQPFXTMAKBFPDHKLEVKGKEFYCDDKLhKWAVPLST^ 130 

+ +F +A+EFPDH+L VKG+ FY D XL GWA*FLS W EKSN YP V TI+FDBF+ 
Sbjct: 62 VSNYFNDVAQE7PDHELVVKGRRFYIDGKLAGWAI PLSVWQS EKSNAYPNVSTI V7DEFI 121 

Query: 131 XEKSKITYLPNEAEALLNMMETVTRRRTNTRCVTtLSNATSWNPYFLYFNL^ 190 

EK Y+PNE ALLN+M+TVFR R RC+ LSNA SWNPYFL+FNL PD+NKRFN 
Sbjct: 122 REKDNSNYIPNEVSALLNLMDTVFRNRERvTKTICLSNAVSV™^ 181 

Query: 191 LYQDRGILIEIjCCSKDFAEVTCRETPFGRLIRGTEYEDFSINNEFVNDSDTFIEKRSKNSS 2 50 

♦Y D LIB+ DS DP+ +R«T FGRLI CTEY + S++N+F+ DS FIEKRSK+S 
Sbjct: 182 VYDD--ALIEIPDSLDFSSERRKTRFGRLIDGTEYGEMSLDNQFIGDSHVFIEKRSICDSK 239 

Query: 251 FLCAIAFEGKircYWIDAETGCVYVSYDYQPNTNHFYA>^ 310 

F+ +1 ♦ G G W+D G +YV + P+T * Y +TT D EN +L+ N++NNY+L 
Sbjct: 240 FVFSIVYNGFTLGVWvTJvNC£LMYVT)TAHDPSTKN^ 2 99 

Query: 311 STVAKAFKNSYLRFDNIVIKNLHYDLFNKMKI 342 

+A AF H YLRFON VI+K+ Y+LF KM+I 
Sbjct: 300 RKLASAFMNGYLRFDNQVIRNIAYELFRKMRI 331 



Query= sid|110162|lan|182ORF007 Phage 182 ORF|7795-8775|1
(326 letters)

>gi|1429239|emb|CAA67658| (X99260) upper collar protein
[Bacteriophage B103]
Length = 308
Score = 271 bits (685), Expect = 6e-72

Identities = 131/275 (47%), Positives = 187/275 (67%), Gaps = 5/275 (1%)

Query: 36 YYEHYRRQLTLLTFQLFEWENLPKSIDPRYLEIALHTNGYIX3FFKDPTLGFMVCAGAEDG 95 

+Y HY + L L +QLFEWE LP S+DP YLE ♦+H GY+GF+KDP +G+ + C GA G 
Sbjct: 22 WYYHYYQYLCSLAYQLFEWERLPPSVDPSYLEKSIHQFGYVGFYKDPRIGYIACQGALSG 81 

Query: 96 QIDHYHNPI F FTANEAMYHKRYPVLRYDDDDDKSKC I MLYNNDLKVPTLPS LHRFALDMA 155 

+DHY+ P FA* Y + + YD +-K-+ + t-YNNDLK TLP+L FA D+A 
Sbjct: 82 TVDHYNLPDRFHAS SVG YQNTFKLYNYSDMKEKNMGVAI YNNDLKCSTL PALEMFAQDLA 141 

Query: 156 OINQISRWRRAQKTPVIIQTDEKKYFSL1X5AYNQIDENNQAVFVDKDMEFDESFNVWQT 215 

+1 VN+ AQKTPV+I SL YNQ + N +-FV + ++ D + V++T 

Sbjct: 142 ELKEIIAWQNAQKTPVTilAA^NNQLSLKNiraQYEGMAPVIFVTlESLDLD-NLKVFKT 200 

Query: 216 NAPYVVDKLASELNEVWKEATLTFlXSIIMAtrVDKTARVQTSEVLS^NEQIESSGNZ LLKSR 275 

+APYWDKL ++ N VWNEV+T+LGI NAN++K R+ TSEV SN+EQIESSGNI LK+R 
Sbjct: 201 DAPYVVDKLNAQKNAVWHSVT1TYI£XKNANL£KK£RMVT^ 260 

Query: 276 KEFCDRVNRVFGDELDGKIDVKFRTDAVRQLQLAA 310 

♦E C++++ ++G L VKFR D V Q++L A 

SbjCt: 261 QEACNKISELYGLNL KVKFRYD I VEQMRLMA 291 



Query= sid|110163|lan|182ORF008 Phage 182 ORF|14105-14983|2
(292 letters)

>gi|4210750|emb|CAA10710| (AJ132604) LysL protein [Lactococcus
lactis]
Length = 235

Score = 139 bits (347), Expect = 2e-32

Identities = 85/210 (40%), Positives = 114/210 (53%), Gaps = 14/210 (6%)

Query: 2 MNGIDISSYQTGIDLSKVPCDFWIKATGGTGYVNPDCDRAFQQALSLGKiaGVYHFAHE 61 

MNGIDISSYQ ++ VP DFV IKAT GT Y+NP ♦ Q +• K *Q YHFA 
SbjCt: 1 MKGIDISSYQAEIJIAGIVPSSFVXIKATEGl^INPTWEEQAGQVIQTIIKLIiGFYHFAS- 59 

Query: 62 RGLEGTPQQEAQPFLDNIKGYTGKAVLILDFEGS - - NQKDVNWAXAFliDYVYNKTGVKAW 119 

G P EA FF+ +K YIGKAVL+LDFE N A+ FL+ V KTO+ 

SbjCt: 60 VGNPIABADFFISVVKNY1GKAVLVLD7EAGAINAHGNVGARQFLNRVKEKTGINPM 116 

Query: 120 FYTYTANLNTTDFSS IAXGDYGLWVAEYGSNQPQGYSQFAPPKTNN FPIVACFQF 174 

Y + ++S+I+ + LWVA+Y SPGY +PT+ ♦ A Q«- 

SbjCt: 117 IYMSSDVTRQFNWSTISSTN- PLWVAQYASMNPTGYQ- - SEPWTDGKGYGAWSSAAIHQY 173 

Query: 175 TSKGRLPGYNGNliDLNVFYGDGNTWDLYVG 204 

+S G L ++GNLD+N+ Y + N W G 
SbjCt: 174 SSAGS LSNWSGNLD I NLAY I NANQNKS LAG 203 



Query= sid|110164|lan|182ORF009 Phage 182 ORF|8765-9601|2
(278 letters)

>gi|1429240|emb|CAA67659| (X99260) lower collar protein
[Bacteriophage B103]
Length = 293

Score = 180 bits (451), Expect = 1e-44

Identities = 115/296 (38%), Positives = 161/296 (53%), Gaps = 33/296 (11%)

Query: 3 UCRYIBSPTVYQPBLSRKERIEVORKQLFDFDYPFYDBTKRAEFETKFINHFYLREIGSE 62 

L YIE ++ Y+ LS E+IE GR +LFDF YP +-DE+ R FET FI +FY+RBIG B 
Sbjct: 8 LSTYI EMWSQYKTGLSMAEKI EKGRPKLFDFQYP I FDESYR1CVFETHFIRNFYMRKIGFB 67 

Query: 63 TMGSPKFNLDEYLNLNWPYVmKWFT^NLEEP-PIFTJDMDYTID^ ^21 

T G FKFNL+ *L +NMPY+NK+F S L +♦ P+ ♦ T K+ DT NR 

Sbjct: 68 TEGIJKFWLETWLIINMPYFNKLFESELIKYDPLENTRLNTTGNKKN DTERNDNR 122 

Query: 122 D ES KNQTKQ VDQTDNRNXNTRDTGTT DSFSRKTYTDTPQKDLRIASNG 169 

D + K+ TK D+T+ + D TT D+F+R +D P L + +N 
Sbjct: 123 DrrGSMKADGKS^KTSDXTOATGSSKEDGKTTGSVTDDNTORXIOSDQPDSRLNLTTN- 181

Query: 170 DGTGVIMYRTNITEOLSKETTSSTGVETNNDiCTNQNTRStfAS EKETKNTD 219 

DG G ♦ YA+ 1 B+ + ++TG TNN +♦ + S S T N 

Sbjct: 182 DGQGTLEYAS AI EENNTNNKRNTTG - - TNNVTS S AES ESTGSGTSDTVTTDNANTTTNDK 239 

Query: 220 INKDQNQTXITC'ITRYKGKXGNTDYADLLEXYRRSVIjRIEKMIFREMNXEGLFLLVY 275 

*U N +D I GKG YA L++ YR ++LRIEK IF EM * LF+LVY 
Sbjct: 240 IJJSQINNVEDYIESKIGKSGTQSYASLVQDYRAALLRIEXRXFDEMQE - - LFMLVY 293 



Query= sid|110165|lan|182ORF010 Phage 182 ORF|1310-2155|2
(281 letters)

>gi|135604|sp|P06812|TERM_BPNF DNA TERMINAL PROTEIN
>gi|75815|pir||ERBPNF terminal protein - phage NF
>gi|579177|emb|CAA68440| (Y00363) gene E product (AA
1-267) [Bacteriophage NF]
Length = 266

Score = 74.9 bits (181), Expect = 5e-13

Identities = 73/275 (26%), Positives = 129/275 (46%), Gaps = 37/275 (13%)

Query: 3 VRJSKNDRAKLEXIYGKSNXARXKYNRLRQK-GVB- - -ERQLPTVPTSKKRLIDYVKSTN 58 

+RI + MD+A K+ K* KA K +R ++K G++ E +LP + ♦ + 
Sbjct: 7 IRITNNDKALYAKI*V-KNTKA--KISRTXKKYGIDLSNEIELPPLESFQ 52 

Query: 59 MSRSDFNKMLDELVDFAQPYNENYIFEINKRJA/AISRAQIKEAQIKTEQAQXAKEEHYXE 113 

+R +FNK + F N+NY F NX ♦ S+A+I E T++AQ+ +B +E 
Sbjct: S3 -TREEFNKWKQKQ£SPTNRAyQNYQFVKNKYGIVA5KAKINEIAKOTKEAQRIVT3EQREE 111 

Query: 119 L NKVEVKKPTENTIVTPTILTELGADLP FQAI PDFNXDAFTSPEGVQSYLEM 170 

+ K + I++P+ +T G P DFN D S +++ E 

Sbjct: 112 IEDKPFISGGKQQGTVGQRMQILSPSQVT--GISRP SDFNFDDVRSYARLRTLEEG 165 

Query: 171 IG-XQDEQYTOERDQLYYDNFRQAMFTIFNSD-^ADDIVRLLDSMGIjDLFMXTYVSNFIjD 227 

+ X Y+D R * NF + ♦ FNSD +D++V L + DF + Y+ P + 
SbjCt: 166 MAEKASPDYYDRRMTQ>fflQNFIEIVEKSFNSDWLSDELVERLXKIPPDDFFELYLM-FDE 224 

Query: 228 MNLDYIYDEABVQQKKBQVYSKIAKVIESETGGBV 262 

++ +Y E B + B * +KI ++ G+V 
Sbjct: 225 ISFEYFDSEQEDVEASEAMLNK1HSYLDRYERGDV 259 



Query= sid|110166|lan|182ORF011 Phage 182 ORF|9607-10158|1
(183 letters)

>gi|1429241|emb|CAA67660| (X99260) pre-neck appendage protein
[Bacteriophage B103]
Length = 860

Score = 50.8 bits (129), Expect = 6e-06

Identities = 29/105 (27%), Positives = 56/105 (52%), Gaps = 6/105 (5%)

Query: 8 KKFIX3LPAVFKERFSICYT'HTEYRYELLLOEEVSALIAYLNEVGALVND 67 

+RF+ L + + + +YT+ + L E+++ +1 YLN++G L ND+ N +E 
Sbjct: 7 RRFEXLGEMMVQVYERYLPTAFDESMTLLEKMNKIIEYI^QIGRLTOT 66 

Query: 68 V-EKI^ITNDTIJaCWLSDGTLENLINDTVFANYIKEIXRLQILV 111 

+ + LE+ +TL+KW +G +L+ I E+K+ + V 

Sbjct: 67 LNDGLEDYVXBTLBXVYSEjGXFADLV IQVIDELXQFGVSV 106 

Query= sid|110169|lan|182ORF014 Phage 182 ORF|13716-14108|3
(130 letters)

>gi|137936|sp|P11188|VG14_BPPH2 LYSIS PROTEIN (LATE PROTEIN GP14)
>gi|75860|pir||WMBM9 gene 14 protein - phage phi-29
>gi|15678|emb|CAA28631| (X04962) gene 14 product (AA
1-393) [Bacteriophage phi-29] >gi|225369|prf||1301270J
gene 14 [Bacteriophage phi-29]
Length = 131

Score = 96.7 bits (237), Expect = 6e-20

Identities = 53/131 (40%), Positives = 81/131 (61%), Gaps = 3/131 (2%)

Query; 1 MIEYITQWL-ADDrmLVYGLIIWIJ*VAMIIDFVLGFTIAKFNKEIDFSSPKAKAGIlVKV 5 9 

MI ++ +L D+ L+Y L +LMV M++D VLG AK N I FSSPK K G+++KV 
Sbjct: 3 M I A WMQHF LETDETKLI YWLT - FLWVCMWDTVLG VLFAKLNPNIKFS S FKI KTG VL I KV 61 

Query: 60 AEMVLWYFI PVAVKFGAVGITMYITMLVGLILSE I YSI LGHISDIDDDNNWTDYVKKFL 119 

+EM+L + IP AV F A G+ + T+ L +SEIYSI GH+ +DD ++* + p 
Sbjct: 62 SEMI LALLAX PFAVPFPA-GLPLLYTVYTALCVSEI YSI FGHLRLVDDKSDFLEI LENFF 120 

Query: 120 DGTLNRKDD I K 130 

T + + K 
Sbjct: 121 KRTSGKNXEEK 131 



Query= sid|110170|lan|182ORF015 Phage 182 ORF|854-1225|2
(123 letters)

>gi|15670|emb|CAA24483| (V01155) reading frame 10 (may be gene 4)
[Bacteriophage phi-29]
Length = 124

Score = 69.9 bits (168), Expect = 6e-12

Identities = 39/119 (32%), Positives = 64/119 (53%), Gaps = 3/119 (2%)

Query: 3 IVKSTPDTQTPEGMLQVCTIATNGASIPLRHAI-GEVLEIJCDILVYSDEVSGFGGAEPSQA 61 

IVK+TFDT+T EG +++FNA G +N G ++E I Y +0 A+ + 

Sbjct: 6 I VKATFDTETLEGQI KI FNAQTGGGQSFKNLPDGT I IEAKAIAQYKQVSDTYGDAK- - EE 63 

Query: 62 ELVAFFTEDGKTYAGVSAVATKSAXNLIDMMTANPD X KPKI SFVEGKSNGGQKFVNLQV 120 

+ F DG Y+ +S ++A +LID++T + K+ V+G S+ G P +LQ+ 

Sbjct: 64 TVTTIFAAIXISLYSAISKTVAEAASDLIDLVTRHKLETFKVKWQGTSSKGNVPPSLQL 122 



Query= sid|110174|lan|182ORF019 Phage 182 ORF|4323-4613|3
(96 letters)

>gi|1429235|emb|CAA67654| (X99260) head morphogenesis protein
[Bacteriophage B103]
Length = 101

Score = 60.9 bits (145), Expect = 1e-09

Identities = 34/96 (35%), Positives = 53/96 (54%), Gaps = 5/96 (5%)

Query: 1 MEIKRHBSIUrail^SVTDGEARSKIVEHLEAIJlEDYGATTEALT^ 60 

M8 HB ILN + + + R+++ L+ LR DYG+ + S EKL+ +N 

Sbjct: 3 MEHDSHEEILNKLNDPELEHSERTEL LQQLRADYGSVLSEFSELTSATEKLRAENSD 59 

Query: 61 LVISNSKLFRERAIVEPAEN- -NEPETDQNITLDDL 94 

L++SNSKLFR+ I +• E ♦ E + IT++0L 
Sbjct: 60 LIVSNSKLFRQVGITKEKEEBIKOEELSETITIEDL 95 



Query= sid|110180|lan|182ORF025 Phage 182 ORF|548-814|2
(88 letters)

>gi|138099|sp|P06955|VG6_BPPZA EARLY PROTEIN GP6
>gi|75841|pir||ERBP6Z gene 6 protein - phage PZA
>gi|216047 (M11813) gene 6 product [Bacteriophage PZA]
>gi|224746|prf||1112171K ORF 6 [Bacteriophage PZA]
Length = 96

Score = 55.0 bits (130), Expect = 8e-08
Identities = 28/79 (35%), Positives = 45/79 (56%)

Query: 4 KLMQRNVTSTKVEFSEVIVQDGAPTIVPCE PWLTGKLSEEKALSAI KRKNPDIOfVWTN 63

K+MQR +T T V DC * G LS E-fA +KRK + V V + 

Sbjct: 3 KMMQR EITKTTVNVAKMVMVDGEVQVEQLP S ET FVGNLSMEQAQWRMKRKYKGE PVQ WS 62 

Query: 64 VSKETALYTMPVDKFIELA 82 

V T +Y +PV+KF+E+A 
Sbjct: 6 3 VEPNTEVYELPVEKFLEVA 81 
Table 26

Secondary structure prediction for ORF 182ORF008 

1 MMNGIDISSY QTGIDLSKVP CDFVNIKATG GTGYVNPDCD RAFQQALSLG KKIGVYHFAH 

CCCCCCCCCC CCCCCCCCCC CCEEEEEECC CCCCCCCCCC HHHHHHHHHC CCCCEEEEEE 

61 ERGLEGTPQQ EAQFFLDNIK GYIGKAVLIL DFEGSNQKDV NWAKAFLDYV YNKTGVKAWF 

CCCCCCCCHH HHHHHHHHHC CCCCEEEEEE CCCCCCCHHH HHHHHHHHHH HCCCCCEEEE 

121 YTYTANLNTT DFSSIAKGDY GLWVAEYGSN QPQGYSQPAP PKTNNFPIVA CFQFTSKGRL 

EEECCCCCCC CCCEECCCCC CEEEEECCCC CCCCCCCCCC CCCCCCCEEE EEEECCCCCC 

181 PGYNGNLDLN VFYGDGNTWD LYVGKKQDQI VPPENKIFDA TSDEFIFTLT TGSTSVFYFD 

CCCCCCCCEE EEECCCCCCE EEECCCCCCC CCCCCCCCCC CCCEEEEEEC CCCCEEEECC 

241 GETIFELSDP TQLDHIRGTY NHVHGKEIPS MVWTPEQFDI YLKMYEKKPV YK 

CCEEEECCCC CCHHHHCCEE CCCCCCEECC CCCCCCCHHH HHHHHCCCCE EC 



Secondary structure prediction for ORF 182ORF014 



1 MIEYITQWLA DDNHLVYGLI IWLMVAMIID FVLGFTIAKF NKEIDFSSFK AKAGIIVKVA 
CCCCEECCCC CCCCHHHHHH HHHHHHHHHH HHHHHHHHHC CCCCCHHHHH HHHCEEEEEE 
61 EMVLWYFIP VAVKFGAVGI TMYITMLVGL ILSEIYSILG HISDIDDDNN WTDYVKKFLD 
EEEEEEEECC CEEECCCEEE EEEEEEEEEE EEEEEEEECC CCCCCCCCCC CEEEEEEECC 
121 GTLNRKDDIK 
CCCCCCCEEC 
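
The prediction strings in Table 26 can be summarised as runs of identical states. The sketch below is illustrative only; the function names are hypothetical, and it assumes the conventional three-state reading of the letters (H for helix, E for extended strand, C for coil), which the table itself does not spell out. Positions are 1-based and the blanks that separate blocks of ten are ignored.

```python
from itertools import groupby


def segments(prediction):
    """Return (state, first_position, last_position) for each run of states."""
    states = prediction.replace(" ", "")
    runs = []
    position = 1
    for state, run in groupby(states):
        length = len(list(run))
        runs.append((state, position, position + length - 1))
        position += length
    return runs


def composition(prediction):
    """Return the fraction of residues assigned to each state."""
    states = prediction.replace(" ", "")
    return {state: states.count(state) / len(states) for state in sorted(set(states))}
```

For the first twenty positions of the 182ORF014 prediction, `segments("CCCCEECCCC CCCCHHHHHH")` gives a coil run at 1-4, a strand run at 5-6, a coil run at 7-14 and a helix run at 15-20.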
Table 27

Enterococcus accession numbers 242/242 
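
Each entry in this table follows one pattern: a gi number, a database tag, a versioned accession, a locus name, and the gi number repeated in brackets. The repeated number makes a simple transcription check possible. The sketch below is illustrative only; `parse_entry` is a hypothetical helper, and the set of database tags it accepts (`gb`, `emb`, `dbj`) is limited to those that appear in the entries listed here.

```python
import re

_ENTRY_RE = re.compile(
    r"^gi\|(\d+)\|(gb|emb|dbj)\|([A-Z]+\d+)\.(\d+)\|(\S+)\s+\[(\d+)\]$"
)


def parse_entry(line):
    """Parse one accession entry and verify that both gi numbers agree."""
    match = _ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an accession entry: {line!r}")
    gi, database, accession, version, locus, gi_repeat = match.groups()
    if gi != gi_repeat:
        raise ValueError(f"gi numbers disagree: {gi} vs {gi_repeat}")
    return {
        "gi": int(gi),
        "database": database,
        "accession": accession,
        "version": int(version),
        "locus": locus,
    }
```

For example, `parse_entry("gi|4803755|dbj|AB026843.1|AB026843 [4803755]")` returns gi 4803755, database `dbj`, accession AB026843 and version 1; an entry whose two gi numbers differ raises an error.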



gi|2895751|gb|AF044978.1|AF044978 [2895751]
gi|4803755|dbj|AB026843.1|AB026843 [4803755]
gi|4769001|gb|AF140549.1|AF140549 [4769001]
gi|4760901|gb|AF099088.1|AF099088 [4760901]
gi|4704705|gb|AF121254.1|AF121254 [4704705]
gi|3342117|gb|AF076604.1|AF076604 [3342117]
gi|4688824|emb|AJ132470.1|ESP132470 [4688824]
gi|4732085|gb|AF125553.1|AF125553 [4732085]
gi|4732082|gb|AF125552.1|AF125552 [4732082]
gi|4732079|gb|AF125551.1|AF125551 [4732079]
gi|4732076|gb|AF125550.1|AF125550 [4732076]
gi|4732073|gb|AF125548.1|AF125548 [4732073]
gi|4732070|gb|AF125547.1|AF125547 [4732070]
gi|4732067|gb|AF125546.1|AF125546 [4732067]
gi|4732064|gb|AF125545.1|AF125545 [4732064]
gi|4732061|gb|AF125544.1|AF125544 [4732061]
gi|4704653|gb|AF114715.1|AF114715 [4704653]
gi|4704564|gb|AF102550.1|AF102550 [4704564]
gi|4688827|emb|AJ238249.1|EFA238249 [4688827]
gi|4680606|gb|AF125198.1|AF125198 [4680606]
gi|4633279|gb|AF117609.1|AF117609 [4633279]
gi|4633124|gb|AF110130.1|AF110130 [4633124]
gi|4590399|gb|AF124258.1|AF124258 [4590399]
gi|4590336|gb|AF108380.1|AF108380 [4590336]
gi|4590335|gb|AF108379.1|AF108379 [4590335]
gi|4019167|gb|U21300.1|CXU21300 [4019167]
gi|4545122|gb|AF077816.1|AF077816 [4545122]
gi|4433610|gb|AF106614.1|AF106614 [4433610]
gi|4468838|emb|AJ132039.1|EFA132039 [4468838]
gi|4468121|emb|AJ132958.1|BPH132958 [4468121]
gi|4456104|emb|Y17302.1|EHI17302 [4456104]
gi|4433611|gb|AF106615.1|AF106615 [4433611]
gi|4433607|gb|AF106611.1|AF106611 [4433607]



gi|4098267|gb|U76614.1|BLU76614 [4098267]
gi|47019|emb|Y00116.1|SFAMB1 [47019]
gi|4158179|emb|AL035206.1|SC9B5 [4158179]
gi|4165458|emb|X79343.1|EF16SSPA [4165458]
gi|4165457|emb|X79342.1|EFTRNALA [4165457]
gi|4165456|emb|X79341.1|EF23SRNA [4165456]
gi|4150978|emb|Y14027.1|EFY14027 [4150978]
gi|4127803|emb|AJ223161.1|EFAJ3161 [4127803]
gi|2956685|emb|Y16413.1|EFENTIJO [2956685]
gi|2665346|emb|Y13922.1|EHY13922 [2665346]
gi|4324675|gb|AF109375.1|AF109375 [4324675]
gi|4234627|gb|AF061013.1|AF061013 [4234627]
gi|4234626|gb|AF061012.1|AF061012 [4234626]
gi|4234625|gb|AF061011.1|AF061011 [4234625]
gi|4234624|gb|AF061010.1|AF061010 [4234624]
gi|4234623|gb|AF061009.1|AF061009 [4234623]
gi|4234622|gb|AF061008.1|AF061008 [4234622]
gi|4234621|gb|AF061007.1|AF061007 [4234621]
gi|4234620|gb|AF061006.1|AF061006 [4234620]
gi|4234619|gb|AF061005.1|AF061005 [4234619]
gi|4234618|gb|AF061004.1|AF061004 [4234618]
gi|4234617|gb|AF061003.1|AF061003 [4234617]
gi|4234616|gb|AF061002.1|AF061002 [4234616]
gi|4234615|gb|AF061001.1|AF061001 [4234615]
gi|4234614|gb|AF061000.1|AF061000 [4234614]
gi|3138990|gb|AF060241.1|AF060241 [3138990]
gi|3138986|gb|AF060240.1|AF060240 [3138986]
gi|4204535|gb|AF094803.1|AF094803 [4204535]
gi|4204534|gb|AF094802.1|AF094802 [4204534]
gi|4204533|gb|AF094801.1|AF094801 [4204533]
gi|4204532|gb|AF094800.1|AF094800 [4204532]
gi|4204531|gb|AF094799.1|AF094799 [4204531]
gi|4204530|gb|AF094798.1|AF094798 [4204530]
gi|4204529|gb|AF094797.1|AF094797 [4204529]
gi|4204528|gb|AF094796.1|AF094796 [4204528]
gi|4204527|gb|AF094795.1|AF094795 [4204527]
gi|4204526|gb|AF094794.1|AF094794 [4204526]

gi|4204525|gb|AF094793.1|AF094793 [4204525]

gi|4204524|gb|AF094792.1|AF094792 [4204524]

gi|4204523|gb|AF094791.1|AF094791 [4204523]

gi|4204522|gb|AF094790.1|AF094790 [4204522]

gi|4204521|gb|AF094789.1|AF094789 [4204521]

gi|4204520|gb|AF094788.1|AF094788 [4204520]

gi|4204519|gb|AF094787.1|AF094787 [4204519]

gi|4204518|gb|AF094786.1|AF094786 [4204518]

gi|4204517|gb|AF094785.1|AF094785 [4204517]

gi|4204516|gb|AF094784.1|AF094784 [4204516]

gi|4204515|gb|AF094783.1|AF094783 [4204515]

gi|4204514|gb|AF094782.1|AF094782 [4204514]

gi|4204513|gb|AF094781.1|AF094781 [4204513]

gi|4204512|gb|AF094780.1|AF094780 [4204512] 

gi|3873186|gb|AF034779.1|AF034779 [3873186]

gi|4151367|gb|AF093508.1|AF093508 [4151367]

gi|2828136|gb|AF039903.1|AF039903 [2828136]

gi|2828135|gb|AF039902.1|AF039902 [2828135]

gi|2828134|gb|AF039901.1|AF039901 [2828134]

gi|2828133|gb|AF039900.1|AF039900 [2828133]

gi|2828132|gb|AF039899.1|AF039899 [2828132]

gi|2828131|gb|AF039898.1|AF039898 [2828131]

gi|4103866|gb|AF028812.1|AF028812 [4103866]

gi|4103864|gb|AF028811.1|AF028811 [4103864]

gi|2605925|gb|AF029727.1|AF029727 [2605925]

gi|1402750|gb|U60038.1|EFU60038 [1402750]

gi|1835780|gb|U86375.1|EFU86375 [1835780]

gi|3831555|gb|AF047608.1|AF047608 [3831555]

gi|3790617|gb|AF097414.1|AF097414 [3790617]

gi|3767587|dbj|AB005036.1|AB005036 [3767587]

gi|3757810|gb|AF042288.1|AF042288 [3757810]

gi|3747039|gb|AF093509.1|AF093509 [3747039]

gi|3660559|dbj|AB017811.1|AB017811 [3660559]

gi|1147743|gb|U42211.1|EHU42211 [1147743]

gi|3676412|gb|AF051917.1|AF051917 [3676412]

gi|3676164|emb|AJ011113.1|EFA011113
[3676164]

gi|2612869|gb|AF005726.1|AF005726 [2612869]
gi|2353762|gb|AF016233.1|AF016233 [2353762] 



gi|2149899|gb|U94707.1|EFU94707 [2149899]
gi|2149149|gb|U82366.1|LSU82366 [2149149]
gi|1469463|gb|U49512.1|EFU49512 [1469463]
gi|1244503|gb|U35366.1|EFU35366 [1244503]
gi|833854|gb|U26268.1|EFU26268 [833854]
gi|841200|gb|U18931.1|CPU18931 [841200]
gi|460079|gb|U00457.1|U00457 [460079]
gi|460077|gb|U00456.1|U00456 [460077]
gi|535661|gb|U4675.1|INSTRANSPO [535661]
gi|3023041|gb|AF007787.1|AF007787 [3023041]
gi|431124|gb|L15633.1|TRN916ENT [431124]
gi|388106|gb|L23802.1|ENEEBSA [388106]
gi|3608387|gb|AF071085.1|AF071085 [3608387]
gi|3551851|gb|AF076027.1|AF076027 [3551851]
gi|3551773|gb|U94770.1|SPU94770 [3551773]
gi|3551743|gb|U57498.1|ECU57498 [3551743]
gi|3243178|gb|AF063010.1|AF063010 [3243178]
gi|3136316|gb|AF063900.1|AF063900 [3136316]
gi|3540256|gb|AF052459.1|AF052459 [3540256]
gi|755215|gb|U17696.1|LLU17696 [755215]
gi|3421437|gb|AF082295.1|AF082295 [3421437]
gi|3421436|gb|AF082294.1|AF082294 [3421436]
gi|3421435|gb|AF082293.1|AF082293 [3421435]
gi|3421434|gb|AF082292.1|AF082292 [3421434]
gi|3341430|emb|Y17797.1|EFY17797 [3341430]
gi|3319647|emb|X69092.1|EHFBP3RA [3319647]
gi|3292886|emb|AJ007584.1|EFA7584 [3292886]
gi|3261536|emb|AL021958.1|MTV041 [3261536]
gi|3250708|emb|Z95150.1|MTCY164 [3250708]
gi|3249688|gb|AF070678.1|AF070678 [3249688]
gi|3249687|gb|AF070677.1|AF070677 [3249687]
gi|3249686|gb|AF070676.1|AF070676 [3249686]
gi|3219158|dbj|AB015233.1|AB015233 [3219158]
gi|2765275|emb|Y12924.1|SPY12924 [2765275]
gi|3183687|emb|Y11621.1|EA16SRRN [3183687]
gi|2765274|emb|Y12923.1|EFY12923 [2765274]
gi|2765273|emb|Y12922.1|ESY12922 [2765273]
gi|2765272|emb|Y12921.1|ESY12921 [2765272]
gi|2765271|emb|Y12920.1|EDY12920 [2765271]
gi|2765270|emb|Y12919.1|ESY12919 [2765270]
gi|2765269|emb|Y12918.1|ECY12918 [2765269]
gi|2765268|emb|Y12917.1|ECY12917 [2765268]
gi|2765267|emb|Y12916.1|EPY12916 [2765267]
gi|2765266|emb|Y12915.1|ESY12915 [2765266]
gi|2765265|emb|Y12914.1|ERY12914 [2765265]
gi|2765264|emb|Y12913.1|EMY12913 [2765264]
gi|2765263|emb|Y12912.1|EHY12912 [2765263]
gi|2765262|emb|Y12911.1|EMY12911 [2765262]
gi|2765261|emb|Y12910.1|EGY12910 [2765261]
gi|2765260|emb|Y12909.1|EDY12909 [2765260]
gi|2765259|emb|Y12908.1|ECY12908 [2765259]
gi|2765258|emb|Y12907.1|EAY12907 [2765258]
gi|2765257|emb|Y12906.1|EFY12906 [2765257]
gi|2765256|emb|Y12905.1|EFY12905 [2765256]
gi|2894541|emb|AJ223332.1|EFAJ3332 [2894541]
gi|2894539|emb|AJ223331.1|EFAJ3331 [2894539]
gi|3108058|gb|AF060881.1|AF060881 [3108058]
gi|3087776|emb|AJ223633.1|EFAJ3633 [3087776]
gi|3080754|gb|AF016483.1|AF016483 [3080754]
gi|2197119|gb|AF003921.1|AF003921 [2197119]
gi|2982722|dbj|AB012213.1|AB012213 [2982722]
gi|2982721|dbj|AB012212.1|AB012212 [2982721]
gi|2058780|gb|B07890.1|B07890 [2058780]
gi|2058779|gb|B07889.1|B07889 [2058779]
gi|2058778|gb|B07888.1|B07888 [2058778]
gi|2058777|gb|B07887.1|B07887 [2058777]
gi|2058776|gb|B07886.1|B07886 [2058776]
gi|2058775|gb|B07885.1|B07885 [2058775]
gi|2058774|gb|B07884.1|B07884 [2058774]
gi|2058773|gb|B07873.1|B07873 [2058773]
gi|2058772|gb|B07872.1|B07872 [2058772]
gi|2058771|gb|B07871.1|B07871 [2058771]
gi|2058770|gb|B07870.1|B07870 [2058770]
gi|2058769|gb|B07869.1|B07869 [2058769]
gi|2058768|gb|B07868.1|B07868 [2058768]
gi|2058767|gb|B07867.1|B07867 [2058767]
gi|2058766|gb|B07866.1|B07866 [2058766]
gi|2058765|gb|B07865.1|B07865 [2058765]
gi|2058764|gb|B07864.1|B07864 [2058764]
gi|2058763|gb|B07883.1|B07883 [2058763]



gi|2058762|gb|B07882.1|B07882 [2058762] 

gi|2058761|gb|B07881.1|B07881 [2058761] 

gi|2058760|gb|B07880.1|B07880 [2058760]

gi|2058759|gb|B07879.1|B07879 [2058759]

gi|2058758|gb|B07878.1|B07878 [2058758]

gi|2058757|gb|B07877.1|B07877 [2058757]

gi|2058756|gb|B07876.1|B07876 [2058756]

gi|2058755|gb|B07875.1|B07875 [2058755]

gi|2058754|gb|B07874.1|B07874 [2058754]

gi|2058753|gb|B07863.1|B07863 [2058753]

gi|2058752|gb|B07862.1|B07862 [2058752]

gi|2058751|gb|B07861.1|B07861 [2058751]

gi|2058750|gb|B07860.1|B07860 [2058750]

gi|2058749|gb|B07859.1|B07859 [2058749]

gi|2058748|gb|B07858.1|B07858 [2058748]

gi|2058747|gb|B07857.1|B07857 [2058747]

gi|2058746|gb|B07856.1|B07856 [2058746]

gi|2058745|gb|B07855.1|B07855 [2058745]

gi|2058744|gb|B07854.1|B07854 [2058744]

gi|2058743|gb|B07853.1|B07853 [2058743]

gi|2058742|gb|B07852.1|B07852 [2058742]

gi|2058741|gb|B07851.1|B07851 [2058741]

gi|2058740|gb|B07850.1|B07850 [2058740]

gi|2947527|gb|T25933.1|T25933 [2947527]

gi|2924302|emb|X81655.1|EHERMAM [2924302]

gi|2664256|emb|Y12234.1|EFAS48C [2664256]

gi|2879906|dbj|D85752.1|D85752 [2879906]

gi|2746216|gb|AF028836.1|AF028836 [2746216]

gi|2745825|gb|AF039139.1|AF039139 [2745825]

gi|2696019|dbj|AB007844.1|AB007844 [2696019]

gi|48999|emb|X62280.1|EHPBP5G [48999]

gi|2654477|gb|U89914.1|BFU89914 [2654477]

gi|43347|emb|X68646.1|EHPSRAA [43347]

gi|2613034|gb|AH005624.1|SEG_EDDH4RR
[2613034]

gi|2613033|gb|AF029775.1|EDDH4RR2 [2613033]

gi|2613032|gb|AF029774.1|EDDH4FJU [2613032]

gi|2613031|gb|AH005623.1|SEG_EDDHIRR
[2613031] 

gi|2613030|gb|AF029773.1|EDDHIRR2 [2613030] 
gi|2613029|gb|AF029772.1|EDDHIRR1 [2613029]

gi|2613028|gb|AH005622.1|SEG_EDH19RR
[2613028]

gi|2613027|gb|AF029771.1|EDH19RR2 [2613027]

gi|2613026|gb|AF029770.1|EDH19RR1 [2613026]

gi|2613025|gb|AH005621.1|SEG_EDISRR
[2613025]

gi|2613024|gb|AF029769.1|EDISRR2 [2613024]

gi|2613023|gb|AF029768.1|EDISRR1 [2613023]

gi|1881226|dbj|AB001488.1|AB001488 [1881226]

gi|2547160|gb|AF023104.1|AF023104 [2547160]

gi|2547159|gb|AF023103.1|AF023103 [2547159]

gi|2547158|gb|AF023102.1|AF023102 [2547158]

gi|2547157|gb|AF023101.1|AF023101 [2547157]

gi|2415383|gb|AF015775.1|AF015775 [2415383]

gi|2388636|gb|U94356.1|EFU94356 [2388636]

gi|2388634|gb|U94355.1|ECU94355 [2388634]

gi|2340825|dbj|D26045.1|D26045 [2340825]

gi|2226147|emb|Y14080.1|BSY14080 [2226147]

gi|2327026|gb|U87997.1|EFU87997 [2327026]

gi|2318058|gb|AF012532.1|AF012532 [2318058]

gi|1848175|emb|X87189.1|EM23S5SSP [1848175]

gi|1848174|emb|X87187.1|EM16S23SS [1848174]

gi|1848173|emb|X87188.1|EM16S23SP [1848173]

gi|1848172|emb|X87185.1|EH23S5SSP [1848172]

gi|1848171|emb|X87184.1|EH16S23SS [1848171]

gi|1848170|emb|X87181.1|EF23S5SSP [1848170]

gi|1848169|emb|X87183.1|EF23S5SPA [1848169]

gi|1848168|emb|X87191.1|EF23S5SAC [1848168]

gi|1848167|emb|X87180.1|EF16S23SS [1848167]

gi|1848166|emb|X87182.1|EF16S23SP [1848166]

gi|1848165|emb|X87190.1|EF16S23SC [1848165]

gi|1848164|emb|X87186.1|EF16S23SA [1848164]

gi|1848156|emb|X87179.1|ED23S5SSP [1848156]

gi|1848155|emb|X87178.1|ED16S23SS [1848155]

gi|1848154|emb|X87177.1|ED16S23SA [1848154]

gi|2274942|emb|AJ000346.1|EHNAPBC [2274942]

gi|2274939|emb|AJ000042.1|EFGLS24B
[2274939]

gi|414575|gb|L12710.1|ENEAAC [414575]
gi|2245603|gb|AF006008.1|AF006008 [2245603]



gi|2231992|gb|U94530.1|EFU94530 [2231992]

gi|2231990|gb|U94529.1|EFU94529 [2231990]

gi|2231988|gb|U94528.1|EFU94528 [2231988]

gi|2231986|gb|U94527.1|EFU94527 [2231986]

gi|2231984|gb|U94526.1|EFU94526 [2231984]

gi|2231982|gb|U94525.1|ECU94525 [2231982]

gi|2231980|gb|U94524.1|ECU94524 [2231980]

gi|2231978|gb|U94523.1|EOJ94523 [2231978]

gi|2231976|gb|U94522.1|ECU94522 [2231976]

gi|2231974|gb|U94521.1|ECU94521 [2231974]

gi|2196685|gb|U25090.1|EFU25090 [2196685]

gi|2197120|gb|AF003922.1|AF003922 [2197120]

gi|2196683|gb|U25095.1|EFU25095 [2196683]

gi|2196681|gb|U25094.1|EFU25094 [2196681]

gi|2196679|gb|U25093.1|EFU25093 [2196679]

gi|2196677|gb|U25092.1|EFU25092 [2196677]

gi|2196675|gb|U25091.1|EFU25091 [2196675]

gi|2196673|gb|U24682.1|EFU24682 [2196673]

gi|532533|gb|U09422.1|EFU09422 [532533]

gi|487271|dbj|D17462.1|ENENTP [487271]

gi|468459|dbj|D28859.1|ENEPPD1 [468459]

gi|440135|dbj|D16334.1|ENEATPK [440135]

gi|391680|dbj|D13816.1|ENENAABS [391680]

gi|1402524|dbj|D78257.1|D78257 [1402524]

gi|709995|dbj|D30808.1|BACYCB20 [709995]

gi|2109265|gb|U91527.1|EFU91527 [2109265]

gi|1041112|dbj|D78016.1|ENEPPD1A [1041112]

gi|1339880|dbj|D85392.1|ENERPA [1339880]

gi|1339878|dbj|D85393.1|ENEGE1E [1339878]

gi|662918|emb|Z46807.1|EHCOPAYZ [662918]

gi|769796|emb|X86176.1|EFRPODDNE [769796]

gi|1854638|gb|U51479.1|EGU51479 [1854638]

gi|1857221|gb|U72706.1|EFU72706 [1857221]

gi|1857219|gb|U72704.1|EFU72704 [1857219]

gi|1857217|gb|U72705.1|ECU72705 [1857217]

gi|1272655|emb|X96978.1|EFPPD1GNS [1272655]

gi|1272652|emb|X96976.1|EFPLSEP1G [1272652]

gi|1279406|emb|X96977.1|EFPAD1ORF
[1279406]

gi|1070149|emb|X93211.1|EFTNF01 [1070149]
gi|1065723|emb|X92947.1|EFTETMGN [1065723]

gi|1019639|gb|L38972.1|PH4COINJN [1019639]

gi|1151151|gb|U43087.1|EFU43087 [1151151]

gi|1098507|gb|U17283.1|BMU17283 [1098507]

gi|1498072|gb|U64887.1|EFU64887 [1498072]

gi|1498071|gb|U64886.1|EFU64886 [1498071]

gi|1469783|gb|U58049.1|EHU58049 [1469783]

gi|1763666|gb|U81452.1|EFU81452 [1763666]

gi|624694|gb|L38973.1|PH4SEQ [624694]

gi|1730458|emb|Z83305.1|EFVANRES [1730458]

gi|1419498|emb|X84796.1|ECPFW4 [1419498]

gi|1419497|emb|X84795.1|ECPFW3 [1419497]

gi|1419496|emb|X84794.1|ECPFW1 [1419496]

gi|254400|gb|S43266.1|S43266 [254400]

gi|239025|gb|S66277.1|S66277 [239025]

gi|1054931|gb|U38590.1|EFU38590 [1054931]

gi|1244573|gb|U39788.1|EHU39788 [1244573]

gi|1244571|gb|U39789.1|EGU39789 [1244571]

gi|1244569|gb|U39790.1|EFU39790 [1244569]

gi|1255020|gb|U39777.1|ESU39777 [1255020]

gi|1255018|gb|U39775.1|EPU39775 [1255018]

gi|1255016|gb|U39778.1|EDU39778 [1255016]

gi|1255014|gb|U39776.1|ECU39776 [1255014]

gi|1255012|gb|U39774.1|EAU39774 [1255012]

gi|1619922|gb|U69267.1|IVU69267 [1619922]

gi|790436|emb|X84861.1|EFEFMPBP5 [790436]

gi|790434|emb|X84858.1|EFD63RPSR [790434]

gi|790432|emb|X84862.1|EF721PBP5 [790432]

gi|790430|emb|X84860.1|EF63RPBP5 [790430]

gi|790428|emb|X84859.1|EF366PBP5 [790428]

gi|1572800|gb|U70854.1|CELF38A5 [1572800]

gi|1041816|gb|U17153.1|EFU17153 [1041816]

gi|1086523|gb|U39859.1|EFU39859 [1086523]

gi|403564|gb|U01917.1|EFU01917 [403564]

gi|1515474|gb|U66286.1|EFU66286 [1515474]

gi|1513068|gb|U15554.1|LMU15554 [1513068]

gi|1296520|emb|X94181.1|EFENTAORF
[1296520]

gi|1488069|gb|U63997.1|EFU63997 [1488069]
gi|1209525|gb|U35369.1|EFU35369 [1209525] 



gi|1469341|gb|U30931.1|ESU30931 [1469341]
gi|488331|gb|M77276.1|SYNGIP2122 [488331]
gi|1046177|gb|U39733.1| [1046177]
gi|1236613|gb|U49939.1|CVU49939 [1236613]
gi|47491|emb|X55766.1|SS16SR5G [47491]
gi|47490|emb|X55767.1|SS16SR3G [47490]
gi|47061|emb|X56353.1|SFTET916 [47061]
gi|49022|emb|X62755.1|SFNPRG [49022]
gi|47047|emb|X17214.1|SFPASA1 [47047]
gi|47044|emb|X68847.1|SFNOXAA [47044]
gi|47033|emb|V01547.1|SFKANR [47033]
gi|47018|emb|X02027.1|SF5SRNA [47018]
gi|511044|emb|X75752.1|MP16SRNA0 [511044]
gi|511043|emb|X75751.1|MP16SR243 [511043]
gi|886481|emb|X82819.1|ESPLPAM [886481]
gi|517387|emb|X76177.1|ES16SRR [517387]
gi|472916|emb|X76913.1|EHNTPOP [472916]
gi|43351|emb|X55133.1|ES16SRRN [43351]
gi|1143442|emb|X92687.1|EFPBP5G [1143442]
gi|963032|emb|Z50854.1|EHARPQTOU [963032]
gi|886479|emb|X84818.1|EHDNAPSR [886479]
gi|551437|emb|X81654.1|EHIS1216 [551437]
gi|467805|emb|X78425.1|EFPBP5 [467805]
gi|296721|emb|X55961.1|EFPD78 [296721]
gi|287946|emb|Z19137.1|EFPTSHGN [287946]
gi|49042|emb|X63285.1|EHNAKA [49042]
gi|49019|emb|X62658.1|EFSEA1 [49019]
gi|43337|emb|Z12296.1|EFSPREG [43337]
gi|43335|emb|X56895.1|EFPVANAG [43335]
gi|43333|emb|X16421.1|EFPF54 [43333]
gi|43331|emb|X62657.1|EFORF3 [43331]
gi|1065721|emb|X92945.1|EFCAT501 [1065721]
gi|806551|emb|Z49243.1|EF41IOSOD [806551]
gi|806549|emb|Z49244.1|EF4105SOD [806549]
gi|505530|emb|X79542.1|EFAS48 [505530]
gi|43323|emb|X62656.1|EFASPh [43322]
gi|40840|emb|X56422.1|EC16SRNAG [40840]
gi|48189|emb|X04388.1|TN1545TR [48189]
gi|928814|gb|L40841.1|ENETRANSPO [928814]
gi|141856|gb|L01794.1|AD1REPABC [141856]
gi|149125|gb|M90647.1|IP8VANY [149125]
gi|141862|gb|M87836.1|AD1TRAE1 [141862]
gi|141860|gb|M84374.1|AD1TRAA [141860]
gi|141853|gb|M62888.1|AD1PAD1 [141853]
gi|1101637|dbj|D31674.1|EVM16RNA7 [1101637]
gi|1101636|dbj|D31675.1|ENE16RNA8 [1101636]
gi|497792|dbj|D31676.1|ENC16RNA9 [497792]
gi|1022729|gb|U36195.1|EFU36195 [1022729]
gi|488338|gb|M77279.1|SYNGIP3124 [488338]
gi|488335|gb|M77278.1|SYNGIP2563 [488335]
gi|488333|gb|M77277.1|SYNGIP2124 [488333]
gi|488329|gb|M77275.1|SYNGIP2121 [488329]
gi|388267|gb|L19532.1|AD1TRAC [388267]
gi|493016|gb|U03756.1|EFU03756 [493016]
gi|453536|gb|L28754.1|INSTRAN [453536]
gi|153658|gb|M58002.1|STRHYDROLA [153658]
gi|475427|gb|U00681.1|EFU00681 [475427]
gi|818704|gb|U24692.1|EFU24692 [818704]
gi|155036|gb|M97297.1|TRNVAN [155036]
gi|150552|gb|M64978.1|PCFPRGAB [150552]
gi|786274|gb|U22541.1|EHU22541 [786274]
gi|786273|gb|U22540.1|EHU22540 [786273]
gi|559858|gb|L37110.1|AD1CLYL [559858]
gi|643614|gb|U16659.1|ECU16659 [643614]
gi|643612|gb|U16658.1|ECU16658 [643612]
gi|290641|gb|L13292.1|ENECOPPUMP [290641]
gi|624701|gb|L29639.1|ENEVANCRF [624701]
gi|624699|gb|L29638.1|ENEVANCR [624699]
gi|624692|gb|L29641.1|ENEDDLA [624692]
gi|624690|gb|L29640.1|ENEDDL [624690]
gi|493094|gb|U2813.1|ENERRD [493094]



gi|153852|gb|AH000939.1|SEG_STRTN916
[153852]

gi|153851|gb|M22645.1|STRTN9162 [153851]
gi|153850|gb|M20864.1|STRTN9161 [153850]
gi|153660|gb|M36878.1|STRIF2BA [153660]
gi|153585|gb|M13771.1|STRBRP [153585]
gi|153575|gb|M64265.1|STRATPEFHA [153575]
gi|153565|gb|M90060.1|STRATPASEA [153565]
gi|152969|gb|M92376.1|STABLAIA [152969]
gi|309660|gb|L14285.1|PCFPRGWZY [309660]
gi|433714|gb|L12033.1|ENESATA [433714]
gi|290645|gb|L15304.1|ENEVANB2A [290645]
gi|148331|gb|M84146.1|ENEVANR [148331]
gi|148329|gb|M64304.1|ENEVAMH [148329]
gi|148326|gb|M68910.1|ENEVANCRES [148326]
gi|148324|gb|M75132.1|ENEVANC [148324]
gi|148323|gb|L06138.1|ENEVANB [148323]
gi|148321|gb|M85225.1|ENETETM [148321]
gi|148320|gb|L00925.1|ENERTRNA [148320]
gi|148319|gb|L00924.1|ENERRNA [148319]
gi|148317|gb|M81466.1|ENERECA [148317]
gi|148315|gb|M81961.1|ENENAPA [148315]
gi|148312|gb|M38386.1|ENEMSPDPS [148312]
gi|148310|gb|M37185.1|ENEGELE [148310]
gi|148307|gb|L07892.1|ENEBLACREG [148307]
gi|148305|gb|M60253.1|ENEBELAA [148305]
gi|148303|gb|M77639.1|ENEB14NAM [148303]
gi|290644|gb|L16515.1|ENERGTG [290644]
gi|154954|gb|M37184.1|TRN916 [154954]
gi|148301|gb|M69221.1|ENEAAD9A [148301]
gi|148308|gb|M38052.1|ENECYLB [148308]
Table 28

Phage Dpi complete genome sequence. 56506 nucleotides. 

1 acaacaaaaa tatgaagcag atattgggcc aattattgct caacaaaacg caccgaatce gtgtataata

71 taagtgaagc agtttcgtaa acctgacatc ctgctaaata aaaataaagg aggcccgaac atgagtcaaa 

141 acactacacg cactgacgct gaactgacag gcgteactct tttaggaaac caagacacca aatacgatca 

211 tgactataac ccagacgccc ttgaaacttt ccccaacaaa catcctgaaa ataactacct agtaacattt 

281 gacggacatg aattcacttc cctttgccct aaaacaggac agcctgactt cgcgaaegtt ttcattagtt 

351 acattccaaa cgaaaagacg gttgaatcta aatcattgaa attgtactta ttcagcctcc gtaaccaegg 

421 tgactcccac gaagaccgca cgaacautat tttgaacgac ttgtatgaat tgatggaacc taagtacatt 

491 gaagceacgg gcccactcac tccccgtggt ggaattccaa tttacccatt cgccaacaaa gtgaatccec

561 aatttgcaac tcctgaactc gaacagcttc aacttcaacg caaattgaac ttccctggaa atgttcaagg

631 tcttggacga gctattcgac aggaggctgg aatgaaatca gtagttttat catccggcgg agtcgactca 

701 gccacctgtc tagcaattga agctgacaag tggggttcca aaaatgttca cgctacagca ttcaattacg 

771 gacaaaagca tgaagcagaa ctcgaaaacg cegctaacgt tgcaatgttc tacggagtca agcccaccat 

841 cctegaaatc gactcgaaaa tccacccaag ctctagctct tccttattac aaggaaaagg cgaaacttca

911 cacggaaaat cttacgctga aaccctagca gagaaggaag tagttgacac ctatgttcca tttagaaatg 

981 gactaatgctt ttcacaggct gcggctcacg cttactcggt tggagcttcc tacgccgtac atggtgctca 

1051 cgcagacgat gcggctggag gtgcctaccc tgattgcacc cctgagtcct acaattcaat gtcaaatgca 

1121 atggaatatg gaaccggagg caaggcaacc ctcgtcgccc ctctacctac tctaaccaag gcgcaagtcg 

1191 ttaaatgggg aaetgattta gacgttcctt atttcttgac tcgttcatgt tatgaaagtg acgctgaaag 

1261 ttgtggaact tgcgcaactt gtatcgaccg caaaaaggca ttcgaagaaa atggaatgac tgaccccatt 

1331 cattataagg agaattgata tgagagcttc taaaacctca acacccgacg cagctcatca accagttgga 

1401 catttcggaa aatgcgcaaa tttgcacggg catacttaca aagccgaaat ttcattagca ggcggaactt 

1471 atgaccacgg ttcgagtcaa gggatggteg ctgactttta tcacgeeaag aaaatcgcag gtacattcat 

1541 tgacagactt gaccacgctg ttcctcttca agggaatgaa ccaatcgctt tagcaaatgc agttgacacc 

1611 aagcgagttc tateeggatc tagaactacg gctgagaaca tgtcaagatt cctcacctgg actctcacgg 

1681 agcttatgtg gaagcacgct cgtatcgacc ctaccaaact atgggaaact cctacaggtt gcgcagaatg 

1751 tacttactac gagattctca cagaagacga gatcgaaacg ttcaagaacg taaccctcat cgacaaagac 

1821 gaa&agatca ctgtccgcga aattttagag caggagcagg ataatggtta atcaatacaa tcagcctgaa 

1891 agaggcaaga ttcgaatcaa tgtccgcgac cctgagaaaa tgcetatcat ggaaattttc ggtcctacaa

1961 tccaaggtga aggaatggtt ataggtcaaa agactatttt cattcgaact ggtggacgcg accatcattg 

2031 caaetggtgt gactcagcct ttacctggaa cggtactact gagccggaat atatcacagg caaagaagct 

2101 gccagtcgaa tcttgaaact agcCtccaat gataaaggtg aacagatttg taaccacgcg acattgactg 

2171 gaggaaatcc tgccttaatc aacgagccta tggccaagat gatttcgacc ctaaaagaac atggattcaa 

2241 gettggtccc gaaactcaag gaactcgatc ccaagaacgg ttcaaagaag taagcgatae cactattagt 

2311 cctaaaecgc cttcaagtgg aatgagaact aatatgaaaa ttcttgaagc tattgtagat agaatgaatg 

2381 acgaaaacct tgactggtca tttaaaatcg ttatctttga cgaaaatgac ctagcttatg cgcgtgatat 

2451 gtttaaaacc tecgaaggca agttacgtcc agtgaaccac ctttcagttg ggaatgcaaa cgcatacgaa

2521 gaaggaaaaa tcagtgacag gcttcttgaa aagttgggat ggcettggga caaagtgtat gaagacccag 

2591 ccctcaacaa tgttcgacct ttaccgcaac ttcatacact cgtttatgat aataaaagag gagtataaaa 

2661 cgaaaattga gcatctagac aaaatcggta acgtattagg gagagagaac ggatgggctt ccctcaagcc 

2731 ggatgaaact gtaaccttgg acaacactga ggcagccgtt caaagacttt etggtctatt aggcgaggac 

2801 gcagaacgcg acgggttgca agacactcca ctccgtcctg ctaaagcact cgctgaacat accgcagggt 

2871 aecgagaaga ccetaaactt cacctcgaaa aaacattcga cgtcgaccat gaagaccttg etctcgtgaa 

2941 agacattcca ttcaattctt tatgtgagca tcatttagct ccgttcgtag ggaaggcgca tattgcatac 

3011 attcctaagg ataagactac aggtctctca aaatccggtc gagtggttga aggatacgcc aaacgactcc 

3081 aagtacaaga gcgettgact caacaaatcg cegacgctat tcaggaagtt ctaaatcctc aagcagctgc 

3151 ggtcatcgta gaggctgagc atacttgcat gagcggacgc ggeattaaga agcacggggc aacgacagcg

3221 acttcaacta tgcgaggtct cttccaagat gacgcatctg etcgagcaga actgcctcag ctgattaaaa 

3291 agtaggaggc ggaaaacgaa taaaagcgca acctcccggc ttgtccgaac agctctcatt gcggctctat 

3361 acgcgacatt gaccgttgca ttttctgcta ttagttaegg acctattcaa cttagagcca gtgaagcctt 

3431 gattcttcta cctttatgga accatagatg gactccgggg attgcattag gaacaactac cgcaaactcc 

3501 tttecacctc ttggactgat tgacgtceta tccggctcac ttgctacctt ccttggagta gcggcaatgg 

3571 tgaaagtcgc taagatggca agtcccctac attcacttat ctgtccagtt cttgctaacg ctcaccetac 

3641 tgcgccggaa ctccgaatag tttactcttt acctctttgg gaacctgtca tctatgtagg aattagtgaa 

3711 gcgattatcg ttttaatttc acacttcctt atttccacgc tggcgaagaa caaccatttc agaacactga 

3781 taggagcgaa aaacgggatt caacctatac cccgcaggag gtcaegctat tagcaccgac gattacttga 

3851 aggaaagagg agccaatcgc ccattcaatc aactgtacga aagaaacggg attggcaaaa ggtggattga

3921 gcacaagaaa accaatccaa gcactacttc aaaactattc gtcgactcta gtgcatattc tgctcatacc 

3991 aaaggggctg aagttgacat cgacgcccac accgaacacg tgaatgataa cgtgggaacg tttgactgta 

4061 tcgccgaacc cgataaaatc ectggtgtat ttagacagcc taagacacgt gaacagcccc tggaagcacc 

4131 acaaatttct tgggacaatc atctatacac gcgcgagcga atggttgaga aagacaagct cttacctatt 

4201 ttccatacgg gagaagactt caaatggctc aacttgacgc tcgaaactac attcgaaggc ggaaagcata 

4271 ttccttacat cggaattcca ccagccaatg actcgactac gaagcacaaa gacaagtgga tggaaagagt 

4341 atccgaagtt attcgaaaca getctaacce agacgctaag actcacgcat ttgggatgac agtcactagc 

4411 caatcagagc gtcacccatt ctatagcgcc gactceactc ctgtactgct cacaggagcg acgggaaaca 

4481 ctatgacgtc aaaaggatta gttgacttgt cacagaagaa tggaggaatt gatgctgccc gcaggctgcc 

4551 aaaaccggtt caagtcgaaa ctgaacccat catcgaagaa actggagcgc actttagccc agagcaatta

4621 gttgaggact ataaacttcg agcattgttc aatgttcaat acatgctgaa ttgggcagag aactacgaat 

4691 tcaagggaac eaaaaaccgt caacgtcgac tattttagat aagagcttct cgctcttact tttttcaaaa 
4761 aaaaatgaac ttcttataca aaaacgcttg actctatcca cccattatcg tacaatcaca atataaacaa

4831 aacgaataag aggcaaataa aatgacagca gctcaacaag ttaagctcca cttzagaagaa gccggcgctc

4901 actttccaaa agacgctgag tacagcgaca acccagagca agcaaccatg aaagataccc ctaaatggaa

4971 tggcgctcat agagatgagc acgacacgaa aacaacttca tacgaagcat tatagagagg ggtaaggcta

5041 tgaaaaaagt ccaaacttac caagaacatc caaaactagt tgagtteaaa cgtcaacctt cCttaaatct

5111 ccgagaagga aaaataggag tcgatgaagc ggttattcaa ttattcacct tctatagttt caacaatatc

5181 gaggaacetc ctttcattgt actcaaaatg caagaggctg ccgtgaacgg gacttatgaa gcaaaactca

5251 atatgctcaa aagattcaaa attacttaga aacggcttta caaactcgcg ataattcgcg tatattatat

5321 acatcaaaaa aaggaggccc atattatgag tatcaagttc aaaaccgaag aactttcaaa aattgtttct

5391 cagctcaata agttgaagcc tagcaagccg ctagaaatca caaactattg gcatattttt ggtgacggcg

5461 aatgcgtcat gtttacagcg tatgatggct caaacttcct tcgatgcatt atcgacagcg atgttgaaat

5531 tgacgcgatt gcgaaagcag agcagtttgg aaaacttgca gaaaagacca cggccgcaac cgtcacatta

5601 gttcccgaag aaccttcgct aaaagttatt gggaatggtg agtacaatat tgatattgtt acagaagatg

5671 aagagtaccc tacattcgac cacttgcccg aagacgtgag tgaagaaaac gctctcactt tgaaaagctc

5741 gctgttctac ggaaccgcca ataccaacga ttctgcggta tccaaatcag gagcagatgg aatttatacc

5811 ggcttcctgt taaaaggcgg aaaagcaatt actacagaca tcattcgcgt atgtatcaac cctatcaagg

5881 aaaagggact agaaatgctc attccttaca acccaatgag tattttagca agtattcctg atgagaagat

5951 gtacttctgg caaaccgacg atactaccgt ccatatttca tcggcttcag tcgaaattta tggaaaattg

6021 atggaaggta tggaagatea cgaagacgtt tcacagcttg acccaatcga gtttgaagat gatgeggcta

6091 tccctacagc agaaatectg agcgtattag accgccttgt actattcact tcagcctttg acaaaggaac

6161 cgtcgaattc ttattcttga aagaccgact tcgaattaaa acttctacta gcagttatga agacatcatg

6231 tacgcatctg ctggcaagaa agtttcgaag aaagaattca cttgccacct taacagctca ctcttgaagg

6301 aaactgcacc aaccgccacc gaagaaaact tcactgtctc ctatggaagc gaaaccgcaa ttaagatttc

6371 accgaacggt gccgttcact ccctagcact tcaagagccg gaagaataat ggccaagtcc aatttaacta

6441 gaattgcaaa gatggttaga gcaggaaaca gtgaaggtcc tgcttcatct tttgtcaatt cgctgacccg

6511 ggttattgaa cgaactcagc ctgaatataa tccttcgaca tattataagc ccagcggggt tggtggatgt

6581 attcgaaaaa tgtatttcga aagaaccggt gagtctatta tagataacgc agattctaac ctaattgcaa

6651 tgggcgaagc tggaacattt aggcacgaag ttctccaaga gtacacggtt aaaatggctg aaatcgatga

6721 ggactttgaa tggttgaatg tagcagagtt cttgaaagaa aatccagttg aaggaactat cgtcgacgag

6791 cgtttcaaga aaaacgatta cgaaacgaag tgtaagaacg aacttcttca actttcattc ttgtgtgacg

6861 gactagttcg atacaaaggc aagctctaca ttttagagac taagactgaa accatgttca agttcaccaa

6931 acatactgag ccctatgaag aacacaagat gcaagcaact tgctacggaa tgtgtctagg agtcgatgat

7001 gtcattttcc tttatgaaaa tcgagataac ttcgaaaaga aagcctacac gtttcacatc acagacgaga

7071 tgaaaaatca agtccttgga aaaattatga cccgcgaaga gtatgtagag aaaggcgaaa gtcctaaaat

7141 ccatcgctct ccagcctacc gcccatattg cagaaaggas ggccgaaatc tgtgagctac actggaaaaa

7211 tgrtcgagga agactttttc gaaggcgcaa aagactttga gaaagaegct ttcacggtcc gtctacatga

7281 taccactaac ggattccgag gagttgcaaa tccctgcgat tatatagccg caaccaactt tgggaccttg

7351 cctattgaac cgaaaaccac taaagaagct tctttgagct ttaataacat cactgataat caatggttcc

7421 agctaccacg cgcagatgga tgcaaattta ttctcgccgg aattttagtg tatttccaaa agcatgaaaa

7491 gattatatgg tatccaattt caagccttga aaaaattaaa cggtctggag ttaaaagcgt caacccaaac

7561 tccatcgatg cagggtatga agtttctcac aagaagcgtc gaactagatt gaccattcct ttccaaaatg

7631 ttctagatge agttgagctt cattacaagg agaaaagcaa tggcaagacc taagttacct caaattgata

7701 ttcgagaaga agaaatacga gaegctcaag acgtagcaga ctcgtatggt gcgattatca ataaagtagt

7771 cgacgaaact gttgaagcag cttgeggtec acttgaccag gcaatggaag aaattcaaac agttgtaagc

7841 caaaatcctg tcattatgga agaccttaac tactacattg gctatcttcc caetcttett tattttegccg

7911 cagatagggc ggaaatggtg ggaatacaaa tggattcaag ttctgctatc aggaaagaas aatacgataa

7981 tctatacatt ttagccgccg ggaaaactat ccctgacaag caagcagaaa ctcgaaaact tgtcatgaat

8051 gaaga&gtca tcgaaaatgc tcacaagcga gcctacaaga aagttcaatt aaagctagaa caggccgata

8121 aggtactagc atccttaaaa cgaacccaaa cctggcaact agcagagtta gaaactcagt caaataattc

8191 aaaaggagta ttattaaatg caaaaagacg tagacgtgaa aatgattgac cctaaacttg accgattaaa

8261 atacacaggt gattgggtcg acgcacgaac tagttctatc actaaaattg acgccgacag cgccgatgtc

8331 tcaagatgtc gaaaagtgct tcaaaaggct caagtatatt cagtggcggc aggtgaatgc attaaaattg

8401 cacacggatt tgctcttgaa cttcctaagg gatatgaagc aatcttgcat cctcgttcca gtctttttaa

8471 gaaaactggt etaatcttcg ttcctagcgg agtgattgac gaaggttaca aaggtgacac tgatgaatgg

8541 ttetcagttt ggtatgctac ccgtgacgca gatatcttct acgaecaaag aattgcccaa tttagaattc

8611 aggaaaagca acctgctatc aagttcaatt tcgtagaatc tttaggaaat gcggctcgtg gaggccatgg

8681 aagtacaggt gatttctaat gaaattggaa cagttgatga aggactggaa taaggattcg aaagctcttg

8751 cagcagttca aggactcgaa cgcgaagcgc ttccaagaat ccctttttct gcgccttcta tgaattatca

8821 aacctacggc gggctccctc gaaaaagggc agttgaatcc ttcggtcctg agtcaagtgg gaaaactact

8891 tcagctctcg acattgteaa gaatgcgcaa atggtatttg agcaggaatg ggaacagaag actgaagaac

8961 tcaaggaaaa gctggaaaat gcgcgtgcat ccaaagctag caagactgct gtcaaggaac ttgaaatgca

9031 acccgatagc cttcaagagc ctettaagat tgtatatctt gaccttgaga atacattaga cactgagtgg

9101 gctaaaaaga ttggagtcga tgttgacaat atttggatag ttcgccctga aatgaacagc gctgaagaaa

9171 tacctcaata tgtcctagac actttcgaaa caggtgaagt tggcctagta gttctagact ccttgcctta

9241 catggtcagt caaaacctta ctgatgaaga gttgactaaa aaggcctatg caggaatctc agcgcctttg

9311 actgaattta gtcgaaaggt tactcctctt cttactcgct acaatgcaat attcctaggc atcaatcaaa

9381 ttcgagaaga tatgaatagt cagtacaatg cctattcaac tccaggcgga aagatgtgga agcatgcttg

9451 tgcagttcga cttaaattta gaaaaggtga ctaccttgac gaaaacggtg catcattgac ccgtactgcc

9521 cgaaaccctg cagggaatgc agtagagtca ttcgtcgaga agaccaaagc atttaagccg gacagaaaac

9591 tagtttccta tacgctttcc tatcatgatg gaattcaaat tgaaaatgac cttgtagatg tcgctgfccga

9661 acttggagtc attcaaaagg caggggcatg gttcagtatc gtcgaccttg aaactggaga aattatgaca

9731 gatgaagacg aagaaccatt gaagttccaa ggcaaggcaa atctagttcg acgcttcaag gaggatgact

9801 acttattcga catggtgatg accgcggttc acgaaattac cactcgagaa gaaggctaat gcaaaaacct

9871 ctatttggac ctaagctagt gcctgctagt tcaaggcgca agaaaagaac ggttccaaaa cctaaaccta

9941 aaatcgatga gcaagtggtt gagcttatga accgcagaga gcgtcaagtg cttgtteata gttgcatcta

10011 ttattatttt aatgactcaa ctaeagcaga cgggcagcat gacaaatgga gccacgaact atattctctt
10081 acagtctcgc accctgatga gtttcgacag

10151 ctggaatggg tcttccatac gaccgncagt

10221 ttagcttcta aataccgtcc tcaaactccc 

10291 cgaatcaact acaaaatggc gctaccaaac 

10361 caccactgct cgaattttcg cgaaggatgt 

10431 tccaacaacg gggcagaaaa tgttcgaaac 

10501 tcaaagttta caccattgac gaggttcata 

10571 agaagagccc ecaccgggaa ccgcgetcat

10641 ctcagtcgag ttcaacggtt tgactttact 

10711 ttatcgaaag tgaaaatgaa gaaggagccg 

10781 acttgcaaat ggaggaatgc gtgacagtat 

10851 gacatggaag ccgtttctaa tgcactagga 

10921 ctgccaacta tgacggccca aagcgcttag 

10991 attagtgact cgaaacttta cagacttcct 

11061 atcactcaac ctcctgccca ttttgaaagt 

11131 tattgtggat gctagaagaa atgaatgaac 

11201 aattgaaacc aaacttcett tgaegagcaa 

11271 cattccgaaa tggaaaeaac ttccaaaata 

11341 ttaacccgct atatcgcttc gaaatttgac 

11411 gaaacatcat tcaggatgca cagactattt 

11481 aatgtcagct cttaactcge ttttgaagat 

11551 gttgatagca ccaataatgc tttacctacg

11621 ctaacgaaga gaaaatgcag tctgtcaagt 

11691 gattgtagac tattgcaatc ttgccagcaa 

11761 gagctatttg aaaaggttac aacactttat 

11831 ttactaattg gctcaaattt aaggaaactg 

11901 tccaaategg tcgacagttg tcatcaggaa 

11971 gaccttttag tgagggaagc atctaggtgt 

12041 gcgtgaacga atccatcagg agggtcaaac 

12111 ccaataacct aaagccgttc tatatcttgt 

12181 aacgggaaat gtagttcgag aaacttcggt 

12251 tccaatcatc gaacattcgc tgttegagat

12321 ctccggatgt tagatatggg acacttgtct 

12391 ggcctttcct gata&ttgtg ttgagtttga 

12461 aaatactcga ctattgafcag cgacatgatt 

12531 ttgacaatga attggacaag ctgtcgcgat

12601 gcacaagacc gaaaccgaca ttttcagcct 

12671 atgaaagtga ctgaactctt agccaaagga 

12741 ctaataacgc ctgtctcgtg ctaggagccg 

12811 aatcaataag attgtctata actttcaata 

12881 caagccatcg agggcacaaa gaacggtcgc

12951 cettttcact tacttaacaa ataagctgaa

13021 tgacagaagt tgcggtaaat agcccgcaaa

13091 atacttaaaa aggaagcacg gaacagaaac 

13161 tgacagactt taaaaaacgc ttcaagaaag 

13231 tatggattgg cccgaaaatg ataecaattt 

13301 gaaggtggac ttgtcgagca ctcattaaac 

13371 gcaaaggctg ggaag&cacc tacccaatgg 

13441 agttggecag taccgcgaaa ccgaaaaatg 

13511 catgaatacg accctgagca acccacaatg 

13581 tccaactcac gccagttgaa gctcaagcaa 

13651 aaatttgaat ggatgtggag cagccttcga 

13721 gccgcaacct atgtagtcga aaaegaaaac 

13791 ttgaagaagt agccgaagaa aaacctaaga 

13861 tgaagaggct gaagaaaaac caaaagctgg 

13931 gtagaagagc ctaaagaaga gcctaagaaa 

14001 tcgaagaggt agaaagcgca gacgagccga 

14071 tggacacgtt cgagatgtct actacttcca 

14141 gagcctgacg atgacagcga cattcttgta 

14211 aagaagactc cttccacgaa cctgacggca 

14281 atacgacgaa gaaacttggg aacctaccac 

14351 gttgcaaaac ctactcgaaa aactccagcg 

14421 aaatgtgtga aaattgtcaa aacgaaacat 

14491 cgacgcctca ttcacttaca aggagacccg 

14561 aaagaccgtg acagcctttt agtcgctaea 

14631 agagactttg tattgcaaat tctcgaetgg 

14701 aaaggctgaa gatttaaagg acgttatctt 

14771 ttgcaactag tcgaatcagg agcaccataa 

14841 cggcactcat ttagaagtag cagccttgtt 

14911 caggggtata ttgatacccc tgacctttat 

14981 tcaatggcga cgctattgct actgacatct 

15051 tgaaactatc aaacacgagg agcctattgc

15121 aacttcgcga agactatcaa cgtgcaagag 

15191 agaactcgaa aaccttgaag cctttgtggg 

15261 cgaaatgcct cgaggctacg tgtattagac

15331 actatacatg ggttcaccaa ctccgagaca 
actgttctct ataacgagct taaacagtct gacggaaata
ttgctgtaag ggtcgcagaa aggctcttaa gaaaatgaac 
gaggaagcgg cagcccaaga atatgtcaaa gaaactcttt 
acggccaccc atcccgcggc ggcgctggaa ctggcaaaac 
gaacaaagga cttggctctc ctactgaaat tgatgctgct 
actactgaag attctagata caagtctatg gacagcgagt 
tgctctcaac cggagcattc aatgcgccgt tgaaaacatt 
tctaegtacc actgaccctc aaaagactcc cgacaccatt 
cgaattgata atgacgacat cgttaaccaa cttcaaetta 
gtcatagtta tgagcgcgac gccctttcgc ctattgggaa 
cacaaggctc gaaaaagtcc ttgattatag ccatcacgtt 
gctccggact acgaaacaCt cgcttcactt gttgaagcta 
aaactgtaaa tgacttccac tactcaggaa aagacctgaa 
ttcagaggtt tgtaagtace ggctagttcg agatacttca 
aagctagagc aattcegtga ggccctccaa eatcctactc 
tcgctggagt tgttaaatgg gagcctaatg etaaaccgat 
ggaggagcga catgactgga cagggacttg ttaaaectac 
tataatcgtc gaaggcgaag taggttcagg acggaagacc 
gctgattcta ttgtageagg aacgagtgta gacgacactc 
tcaaggcgag aatctacgtg atagacggaa acagcctgtc 
agcggaagag ccacctttaa actgtcatae agccatgacc 
ctcgcaagca gagcaaaagc tceaaccatg ctaccetata 
cctacaagaa ggtagacact tcaggaactg acgaccgagc 
ccctcaaacg cttgaagaca tattagaata tggcgcagaa 
gacttaatat gggaggcaag cgctagcaat tcgctaaagg 
acgaaggaaa aattgagcct aaacteeecc tcaactgtct 
gcactatgta gaaatgtctt ccgaagaact tgaggcccat 
ttgcgaaagg tatccaaaaa gggctcaaat gcgegtgtct 
aagttgagtg atttagtatc atttcaaaaa gacattcgaa 
acggcgaaga aattggtctt atgaacgtct atctcaatea 
ttcaacagtc cggaaaaccc tcactcaaaa agggctcgtt 
gacaaggagc ttctgtctaa tgagtcgagg tggaaaaggc 
tgatggttac taaaattgac aagcgaagca agttgccaaa 
gaaaatgact gacgcgcagt tgaaaaggca ttttgtgtct 
gacatggcca tccagttctg tctaaacgat tactctagaa 
tgaaaaaggt tgacgcatca gtagtcgaat ccattgtcaa 
agtcgatgat gtattggaat ataggccgga gcaggcaatt 
gaaagtccta ttggattgct taccttgctt tatcaaaatt 
atgagcctaa agaagccaat ctaggcatta agcagttctt 
cgagctggac tcagcctctg aaggcatggc tattctaggt 
catacagaaa gctcagtggt ctacattcct tcgcataaaa 
acccgtgtat attacagtat aagcaaagga ggacagccta 
aggtgagagt agttatggtc gggaatatcg aatttctcga 
ttccatcagt catattatag aaaatgaaag gggtctaaca 
cagtaacaga aaeaatcaat cgtgacggca tcgagaacct 
cttctcaagt ccagcaagca cecgatacca tggaagctat 
gtgttcaatc aactactttt cgaaatggat accatggtag 
aaacagttgc aatcgtagca ccatttcacg acccttgcaa 
gcgcaagaac agcgacggtg aatgggaaag ccatttagca 
ggacatggtg caaaatctaa cttcctcctt caacgttCca 
ttttetggca tacgggagcc tatgatatta gtccttaegc 
aactaatcca cttgcattct taatccatcg cgcagatacg 
ttcgaatacc ctcaaggtcc agttgaacaa gaggctgagg 
gttcaactcg caagaaacct gcgcctaagg aagaaaaagc 
aatcactcga cgtcgcaaac ccgcgccaaa agaggaagag 
gcatcttcta aaactcgaat gcctaaaaag actgaaaagg 
aagccgaaga agcagaggac gacaatg&gg cggtacctgc 
cagtgaagtc gccgacgtct actacaagaa agatgccgac 
gacgaagaag ageacatgga cgcaacgtgc eccgtattag 
aggttcacaa attagcaaaa ggtgaacgcc tgccggaaga 
tgaagcag&a tacaccaagc gaacagaaaa acctaaagca 
ccttctcgcc gccctcgcce tcaaaagaaa ggtcgaaata 
tcaatactag aattttcaat gaagatgaaa gtggctatgt 
cgacaccgca gcagctatta gcaaccgagc ggtagaaaag 
gtcatggctc ttcccgtttc tcacgcagaa gatttaggca 
aagcatttcg tgaagctgtt caagaggcce tcgagaatga 
aggtcctatc gacgccgaca aaaaaaccgg caaccttgca 
cggaacgaat aaagacgcta ctccacgcga tttatgctaa
cgataccgct gacgatcatg acgacgttat agaggacati
aatcaaagga gcatcagaat ggcgcctcac aatcctgaca 
tactacgact agatgatact atctacgtcg acgcaacttg 
acgaaeaacc agcgaaagca aacgaacaaa egaatcgtcg 
gtcgaataaa cttcctcccc gccgcaaagg accacggcga 
atacattgac aatecagtcg aatgttttcc tgaaagccaa 
gaccctccag tcaccaatgc ggccgctgaa actggacacc 
aagcagttga aaeacttgaa gaaatcccag atggggacaa 
15401 cattaetcgc tccaaacacg gaatcgaaat

15471 agttagCgcc cttgtactgt cagccttctg 

15541 gaggaccacc gtagcaccgt cgcccttgta 

15611 ectctatcct cgcccacccg ccacgacatc

15681 aaacaagaga aggcagetgc caagcagctg

15751 acaaaggtga cgccgcaaca gacccaatgc 

15821 cagcttgaaa aaggaatggt tcccaaaaaa 

15891 atcgctttcg actttggtga cggaggcgaa 

15961 tagaggatag aaatgataac ctcatttaaa 

16031 ccatgcaact gtacgcagac cttattccta 

16101 tgacectatt gctcgagaaa acgcacccga 

16171 acaaacctcg accagaatga cgtegacg&t 

16241 actacctaac caagctacaa agtcaacaaa

16311 ttgaeaaate ccageaaect gacgagcgca 

16381 ccttgctcat ttcttgcttt aattctttcg 

16451 caattctagc atcaacctcc acgccgcgag 

16521 tactgcaatg tcaagttcgc tcttectaat 

16591 gtgaccttat actgtttcrc agttccttte 

16661 cgrccttcca atctgccgta agacaaccga 

16731 ttgacgcttg ttttatttat attatgatta 

16801 aagttgaact tttttaaata tttttttteg 

16871 aaaattaagt ccatcttcat aagcaagaac 

16941 attccgtgga ctcctttttt aagtccgccg 

17011 caatcctttc gagtcgcttt tcattttgcg 

17081 aatagtttga atggcttcaa aaaagtccgt 

17151 caggaaagca aagcgttcca gctagtgatt 

17221 tcagaatate tttgtagtca ataecagcct 

17291 ccaaatctcc gccctcgtca tcgttttcat 

17361 gtttcgaact cgaatgctaa ggacttccat 

17431 tcccactcta aatcgtcgta gtcgaagata 

17501 ttgccatttt agtttcctcc ttacgcgaca 

17571 tgaacttaac etggtcgacc gtttcttcca 

17641 ttggccgttt ccgtcgataa cttcgtacca 

17711 ttcattttac tacctccact tttccgtcca 

17781 aagacgttct aggcttaccc atttacgacc 

17851 acaactttca Ctcctacttg caaatcttca 

17921 tatagtatta ttatacgata atgagtgaat 

17991 tttttttttc aaaaaaataa cgagccgaag 

18061 ectcatagcc tttacgacgt gctacctttc 

18131 tactttaaag tcatccgcct tggcacagtc 

18201 ttggaaaact cacctatatt agcacaacgc 

18271 ctaaaaagtt gtccaaggtt acaggaaggc 

18341 aaggctgaca atttcactgt ccttaaatag 

18411 ggetctgctc cgctacctag tacatcgcca 

18481 gggcgcccgc acgcgcaacc tggagcccct 

18551 aattccttca aaatagctct tgtccgggtc 

18621 aggccgaaat ataetcgaat ttcatctgta 

18691 ctcctacact tacttttctc gagagattcg 

18761 ctttttgttc tttgccatgc tagcacctcc 

18831 tgcttctcgc gatgcaatag tctcgagaat 

18901 ccagttatgg tggcgtcaat taagcaacca 

18971 atactagcct tttataacag ccacttcctg 

19041 ttgtagacga taaggagttc ctggaacttc 

19111 catgagtttt gaaaacggac aaccttccat 

19181 caatccataa ttgaaaaggc ttatcttctc 

19251 tgaaagcgcg attaggtcat ctaggctgtc 

19321 caaaagtaag cgacatttcc aacectctct 

19391 atagtcgcgc agaataaact tcgaatttca 

19461 ttgcattctc gccacgaaac cgcccttcaa 

19531 tccttccttc tttaaatttc gaaatgtgcc 

19601 agtgaatatt cttccacctg ctttttaaat 

19671 tcttgtagga aggttcgcga gtaggaagcc 

19741 tattttagac actaattcag cgccttgttt 

19811 tcacgctgat taatacaaaa gcacctaaaa 

19881 atattegacc tgcttctttc ccaacagccc 

19951 ttgagcaagt gcgatattat tctctagcat 

20021 tcactcgggt tgtcatttgc taattgaata 

20091 tagtcacttt ctatcatatt ttcgagcttt 

20161 gcccaagcgc gacaagtgtc gaaacgaaat 

20231 tacatttttc aatatctact tcaagttcga 

20301 tagagccttt tcataacctt ctgccaggta 

20371 ttaccaagat tatcaaaatc agcggcgtga 

20441 tacaeaaata gaagcagtct eaecccccaa 

20511 tttaaaactg tcgcttcagc tacaacatta 

20581 cgcctaagac ttcagcttgg tcatcgctca 

20651 cgaaaacttc accttatctc ccecttaett 
taaggagaaa ctcgatgaat tacatggtaa aagtcattct
catgactcgc tcaatggtcc atttggtcac aggtaagcaa 
cttggcgctc ccgcaagctc tgcggcgttc tattcgacac 
acgcgcacac aaaccaattc ccacgcgcag agctagtgct 
ggaggaaaag eacagcctaa tccaggagcc accgactact 
ttatagaatg caagacagtt acgaagccac aaagttcagt 
tgaacaggaa aggttcgctc aaaaactcga ctattctgct 
cagtacacag caacgtccac aagtcagctc aagcgaacat 
ataaacagtg aaggaacagt tactccaatt aaagggtcag 
tacaagagga cgatatacag ttcgttgaca taactggact 
gctcatttca cggagcegtg caggagcccc aaaacacggc 
ttcctacagc acgccaaaga agaagcgctc gactttgcta 
agcaaaataa atagacctat ttctaggtct atttttatta 
atcctctagc gcagacacta ggtggcggct ttcttgttta 
ttaaggcgtt cgattcttgt agctaattcc ttgatgattt 
taagtgcgac tccagtttca gcgacaggac atgccttgaa 
aactgagccc aggtctaagt acaagttagg attgacccca 
acaggaacgc tctcatagtg gaaagtgcag ttcttgtgac 
aataaagtgt tgtttccata attgacctct ttctgcgtcc 
tacgacaata aaggaacaaa gteaagcact ttttacaaaa 
aaaataaaaa gccccaataa tagagctctt agettagcag 
ccgcccgtac cggtaagaaa tagctgattc aatatccggc 
acagtacagt tacaatgacc tattcttgac tgaagttcct 
tatcaaetgt tttcgagtct aggtgagtga aggaacttgc 
tatcgaaact cctttacaag aaagcccatt ccgtgtatag 
cgaatttgag ggttaggaga gtttcgataa gccacaaaat 
cagtacgact gctgacaaac accttcattt tataaccctt 
agcaggcgac aacttcaacc cactcgtcgt cctcaccttc 
gtcctcaaca tcttcgaatc ctccattagg tgcatatcct 
gttacaagac gtccgtcaaa ttttactgtt tcctttactg 
tatagcttga taatttgaga ttcgatgtca ccatagttga 
cgtattegcc cacgtcctcg attcttccgt cttgaatcat 
ccattcatca ccgaattgtt tgattgcttc tttaactgtt 
ccagtgattc gttatcatag aaccgaatac gtccatcact 
ttgacggtca gteactttaa attcagtacc ttttgcattt 
acttttacca tcttatatga ctcctttatt tgtttttctt 
aaagtcaagt gtttttgtaa acttcttcaa attttttaat 
ctacgttatt tatttatctg ctcaagggct tgttgaattg 
cagctctaga gccgggtgaa aagtcccaaa cagtttcgtc 
gagcaggagc tggacagcct tttgccattt ccgccaattc 
aaaacaagtg ctctagtatg ctggctagac acaatgaacc 
cctccggaaa ctcataaggc tctttgacat cgcatccgaa 
cccaccgcct ttacacataa tacettgaac aacttcagta 
accgtgtgac aataggcttt aagaactgca aaaaaacctg 
taacagtcac ccaaggctga ggtttcttac aaacaaccct 
aatagtgcct aacattgtca gcctgttctt atttatataa 
tcaggcagcc actcaacagt gacttttcta caagcgattg 
cagggacaag catttccete etgacattta ctttttttcg 
acttccgtcg gccttgctte ttagctctgt tcagtccagc 
atgcctgttc acaggctcac aatattccgc caaagatttg 
tctaccgact ccttaccata aaatacaaaa tcgtcttggc 
cgcgtgtttc aattttaact aagctcattt tcacccaaac 
gaacaggagc ctcctttttt catcgtctac ttgtttaata 
ttattcccca tagtetcacc ttattccatg tacccgtcaa 
cataaggccg tgataatttt agtccagttc ccactacatt 
tagctcgagt tcgactacaa ggttgccagt atcaatttca 
agtgcttcac gatacctatc atatgtcgcc tcttcgtcaa 
ttttagtcac cgccttccaa aatttcatcg ggcacaatct 
tacacgcttc aagactgaag tcatgttgag gtctgtcaat 
ccgaagcgca ttttttgttt gctcgctagg taggaccata 
cgaatggcta aggctgacaa aaagcctttg aggtatgaat 
ggtcaacacg gcaacgaaga taaagcaaag cagcctcata 
ttcgccgaag aaaactattc gacttttatt caagcgcata 
tcagccgcga gaatatgacc aagttcacgt tcccaccaaa 
gagaagtctc gaactgttta ggttcatcaa attgttcaac 
caactttcga gccataagaa gggcagtttg cccctcttcg 
agatttttaa ccttttcaat aattttttcg ttattcatat 
cgaaaagtca atgtcgtcta cttcaattgt cttgtcataa 
aggctacaaa acaccttttc attatggtcg aaactttcag
gaacgacaac agtatcaaca Cttcgaagcg ataaaaaggc 
aacaactcca gctgaaggcc ccaatccttc agctagaatt 
taaagtttca ctagttactt ccttacatat ctagagtcac 
gccctactca atagcttccc ctccgctgag cttttcgagt 
gcaaagctcg aaccgttgag aatgttttcg atatttcctg 
ccaccactag gtattcatca gtaagtgctt tagcaaagtc 
gtccctcccc acaccattac tatacaacaa tgatcgaata 
20721 aagcaaagca tcttttataa aaaagtcgaa cettttteac aattttttga accacctaaa aattataaaa

20791 tgggcggaaa acctaggcga caatttatac ccactttcaa cctcatttat aaacaatcta atacagaaaa 

20861 ggacttaaca agtaaataaa aaagcgccct gaaaatacct acaaacccca tagtccgtaa gcaaaaacaa 

20931 aaattagggg cgacataaaa gtcgagcact atcttaatct atcaccagcc tcatatacaa tcgacacaga 

21001 tttagcaggc tcctagcaaa ctttcgaaca gcacgaaaaa gcatacaact agaggaacag attatagaaa 

21071 aagcactccc acaaacaagc tetcaaaatg ccctcaaaaa ccgtaaaatt agtaagtctg aacttctcga 

21141 actcctaaac ttetcgaata accgagceta atttagaggt cgaaaaactc aatttcccga aaagtcgaac 

21211 ccgctcgaaa acctcaaaac actcgaaaag tcgagcatag aaaggggccg aaaagccgag aacgctcgaa

21281 aaacccaacc ggcccgaaaa cctcaaccct tcgaaaagtc gaaccattcg aaaagtccaa aagttcgaaa 

21351 aacccaacca ctcgagagta ggaattaagg acacaccagc tcaacctttt tagcttcaaa accacccctt 

21421 ttctcatcat aggactataa attcagtcaa tcgtaagtca cgcgcaaatt tgttacaatg caaacgataa 

21491 aatataaagg agggtcaata aacggcgaaa gctaccggac caaaagttcg aagaggaaaa actcctccac 

21561 ggccaaaaga caaaaaagga atcaaagcaa atgcgcgtgt caaeaaagac cagctcgtag agtatgacta 

21631 taaaggcanc aagacgacaa ttaaggaacg tgatgctaga atgaaactgg aatttatcag aggcatgact 

21701 acccaggaaa ttgcagcccg ctatggacta aatgaaaagc gcgttggcga aatacgggct cgcgataaat 

21771 gggtgaaggc taagaaagag ttcgagaatg aaaaggctcc tgtcactaat gacacaccga ctcaaatgta 

21841 cgcagggtct aaagtctcag ccaatattaa atatcacgcc gcctgggaga aactaacgaa catcgtcgaa 

21911 acgtgtctag ataatcctga cagacattta cttactaaag aaggaaatat tagatggggc gcattagatg 

21981 ccctttcgaa ccttatagat agagctcaaa aaggacaaga aagagcgaac ggaatgcctc cggaagaggt 

22051 ccgatataga ctacaaattg agcgcgagaa aatcacattg ctccgggcca aaatgggcga ccaggaaatt 

22121 gaaggcgagg tcaaagataa cttcgtagaa gcactagaca aagcagctca agccgtttgg caagaattta 

22191 gcgacgcaac aggttcctac attaaaggag cgactgataa tgacaataag cctgagaaat aaactaccta 

22261 agtccaactt cgeccctttt agtaagaaac aactccagce cccaacatgg cggacaaagg gctcaccttt 

22331 tcgaactctc gatatcgtca tagcagacgg ttccattcgc tcaggaaaaa cagtatcgat ggctctttca 

22401 ttttcccttt gggccatgac ggaattcaac ggacaaaact ttgccatctg tggtaagaca attcactcag 

22471 ctcgacgaaa egteaetcag cctctaaagc aaatgctcac aagtcgcggg tatgaaactc gagatgtccg 

22541 aaatgaaaat ctacccacta tcagacactt tagaaatggc gaagaaattg tcaaccactt ctatatattt 

22611 ggaggaaaag atgagtcgag tcaagacctt atacaggggg taacattagc aggtatcttc cgtgatgagg 

22681 eggcactgat gcctgaatcg tttgtcaacc aagcgacagg gcgctgtccc gtaacaggtt cgaaaatgtg 

22751 gttctcttgc aacccggcca atcctaatca ctacttcaag aagaactgga ttgacaaaca ggtcgaaaag 

22821 cgtatcctat atcttcactt tacaaeggac gacaacccca gcttgacgga tagcattaaa aggcgctatg 

22891 agaaaatgta tgctggagtc ttcaggaaaa gatttattct cggccettgg gtaacagcag atggtctagt 

22961 tcattcaatg ctcaatgaag agcagcatgt caaaaagctc aatatagaat ccgaccgttt attcgtagca 

23031 ggcgactttg gcatctataa tgcaacaacc ttcggccttt aeggattctc gaaacgtcat aagcgctacc 

23101 acctaattga gtcatactac cacccagggc gcgaggcgga agagcaacta actgaggcgg atgctaattc 

23171 gaatattcaa tttagttcag ttctacaaaa gactaccaaa gagtacgcaa atgattcagt cgatatgaca 

23241 cgaggaaagc aaatcgaata tataactctc gacccgtctg ettctgctat gattgttgaa cttcaaaagc 

23311 atccttatat agctagaaag aatatcccta tcattcctgc tcgaaatgac gtgacgcttg gcatttcatt 

23381 ccacgctgaa cccttggctg agaacagatc tacactcgac cccagcaaca cgcacgacat tgatgaatac 

23451 catgcctaca gctgggacag taaagcgagc caaacgggag aagatagagt cattaaagag catgaccact

23521 gcacggatag gaacagatat gcctgtctca ctgacgctct aatcaacgat gacttcggtt tcgaaataca 

23591 aatattaccc ggaaaaggcg ctagaaacta actaaacact tttatagaaa ttagtgtata atataagtag 

23661 gaggatttta aacatggcta aaaaaccaaa agctatctca cacacagacg aactgattag tcagtcgttt 

23731 gacagcccct tggcaaagaa ccaaaagttc aagaaagagc ttcaggaagt tgaaaagtat tatcaatact 

23801 tcgacggatt tgatgtcacg gacccgaata ctgactatgg gcaaacatgg aagattgacg aagactcagt 

23871 cgactacaaa cctactcgag aaactcgaaa ctacatccga caacttatca aaaagcaatc acgctttatg
23941 atgggcaaag agccagagct tatctttage ccagttcaag acaatcaaga tgaacaggct gagaacaagc 
24011 gcactctatt cgactctatt ttaaggaact gtaaaetctg gagcaaaagt acaaatgcac tagtcgacgc 

24081 cacagcaggt aagcgggtat cgacgacagt agtagcaaat gccgctcaac aaattgacgt ccagctttat
24151 tcaatgcctc agttcaccta cacagttgac cctagaaacc cttccagctt gctctctgcc gacatcgttt 
24221 accaggacga gcgtacaaaa ggaacgagca ctgaaaaaca actttggcat cattatagat acgaaatgaa
24291 agctggaaca agtcaatcag gaactgcaac agctttagaa gacactgaag aacaacgttg gctcacttat
24361 gccctaacgg atggagagtc gaaccaaatc tatatgacag aaagtggcca aactactatc aaggagacag 
24431 aggccaaact tgtagaaatt gaagacaacc taggaaacaa gattgaagct ccrtcaaaag ttcaagaacc 
24501 cgccccaacc ggcttgaagc aaattccttg tcgagtcatt cctaatgaae cattgactaa tgacatatac 
24571 gggacaagcg acgccaaaga ccccatcaca gtagcagata acctgaacaa aactactagt gacttacgag
24641 attcacctcg acttaaaatg ttcgagcagc ctgttaccat tgatggctct tctaagtcaa ttcaaggaat 
24711 gaagattgcg ccaaacgctt tggtcgacct taagagtgac cctactecct caatcggcgg taccggaggc
24781 aagcaagccc aagecactcc cacttcagga aacttcaacc tcctcccagc ggctgaatat tatttagagg 
24851 gcgctaagaa agccatgtat gaactaatgg accagccaat gcctgaaaag gtacaggagg cgccaccagg 
24921 aactgcaatg cagctcttat cccacgacct aacttctcga tgtgacggaa aacggattga gtgggatgat 
24991 gctattcaac ggctcattca aatgctggaa gaaatttcag caacagcgaa tgttgacttg ggaaatattc 
25061 ctcaagatac tcaatcaagt tatcaaacac ttacgacaat gactatcgaa caccactatc caattcctag 
25131 cgatgaactt tctgctaagc aacttgcgct cactgaagtt caaaccaatg tacgcagcca ccaatcttac 
25201 actgaagaac tcagcaagaa ggaaaaggcg gacaaggaat gggaacgcac tttggaagaa cttgctcagc 
25271 ttgacgaaac ctcagctgga gcactgcctg tattagcaaa cgaaceaaac gaacaagagg agcctcaaga 
25341 tgaaacgagt gaagaagacg aagctgatga caaagaaaaa gaacaaactg aacaaccaac cgaagaagga 
25411 gccgacccag acgttcaagg ttaattgega ccactgtgag cataagttcg accctacacc taaacagatt 
25481 atttcgaaac acatcgaaaa gggcgtagag tggagattct ccgaatgtcc taagtgccat tatcggcfcca 
25551 ccacttatgt aggaaacaag gaaatcgaaa accttattcg acttagaaat acttgccgag ctaaaatgaa 
25621 gcaggaactt caaaaaggag ctgctgctaa tcaaaacact taccatccat accgaattca ggatgagcaa 
25691 gctgggcata aaatctcagg gctcatggcg aagctaaaga aggagataaa caccgaaaaa cgagaaaaag 
25761 aatgggtacc tacacagctg ggaaaaggct atccacgaaa ataatactcg cctaacccct gaacaggaac 
25831 aagctgtact gaaagccttc agcgatgcag gaactgattt aatcgcaaag attaaaaagc ctcgaaatgg 
25901 atacttgcct aaaagaaect ataaagacta cgcttacgac ctgcacgccg ttcttgttca actaatgact 
25971 gaacactctc acaaggcggc aatgaacgca gcagatggcc aggcagttca tacectacaa gtattagcag 
26041 aagacggaaa tgctacggcc gaaaagctcg

26111 agcagccgag gcagttgtca aaggcgaaac 

26181 tcagccgcac gcgcaggaaa tgatgtccaa

26251 cagacatggc taaaatgctc gagaaacaca 

26321 agctgagaag ccagggaaac ccgctgccca 

26391 cgaaccacca tcagccattc cgccacagct 

26461 aageccaacg gcactctgtt cacgccccag 

26531 atttcctatc gaagaatgtc ctttcgacca 

26601 tcactcgaag aaaccgctga tgagttgaga 

26671 ggcacgacga ttcaagttca ggaaaagctg 

26741 tcggtteaac accgagtctc tttgtccata 

26811 atattcagtt atgtCataat acaagtCgaa 

26881 tgttccaatt aaacaaaaac agcagattca 

26951 tcaactagaa gacttgttaa aaggtccaga 

27021 acttcgaaag aactcgatgc taaaattttc 

27091 tcgatgaagt tgttcaacag cgcgatgcag 

27161 gctttctaaa caggtcaaag acaacggtga 

27231 aagcagtctc aacttgcaaa aggegcegcg 

27301 ctccagcagc agacattctt ggatttacga 

27371 aggtcttgat gaagagttga aagctgttcg 

27441 gcagaacaag aggctcaagc taagccgcca 

27511 gtggtgttcc cgaaccccgc gaaatcggct 

27581 agcacaagaa caatcatcat tctttaaata 

27651 actgatttta atcaaaccac tcgaagcact 

27721 ttccagctac cgcagcaact caagtaggga 

27791 tactacattt gaaggacgca aaaccggact 

27861 ttcgccgacc aagaagtgtt tgaaggtgaa 

27931 aatatgcagc cctccgaaaa gttggcgaeg 

28001 acaggaggaa ttatagacga atatttatga 

28071 cttccttcaa acgctcctca ataccttgga 

28141 tttcatggct caagggtgca aacaattcgc 

28211 tcttcgtgaa cgcgecggat ttagcaaaca 

28281 ggtgaaaaag accgtcaaaa cttgcaaatg 

28351 ctcaactcta taatgatact aagaaccttg

28421 attgcctcaa tacggtaaat tcactgtcaa 

28491 atggatgcta agcaacaata tgcagccact 

28561 acattttagc agcaatggat gacaecgaaa 

28631 aaacacttat aaccaaatga ctaagagtga 

28701 tgggaaaace tcttgcttct cgcaagtgac 

28771 ctgtctactc taagaaaacc gctcagttcg 

28841 gttcaacttg attgacgacg gtaaagtggt 

28911 actactccag aagcattcga cttggcttca 

28981 ctaccgctac aacctatctt gaaaaacacc 

29051 accactcgaa ggaaCtgacc acgcaggagt 

29121 aagctcttag caccttaatc gtttccggag 

29191 gctcgcttcg tetttaactg aacgcaattt 

29261 gaaactgttc ctcaaacaac tgaaccagtc 

29331 cggctaaaac cgttcctgag cccgtcgaac 

29401 aaaaagcgaa tatatcgacg ccccaatcaa 

29471 tgaattagtc aaaaccaata ccgacaacga 

29541 cctctagaca agcataaatc egccgcccac 

29611 cggtaaccct tggacccatc agcctaaaag 

29681 tgaccaatat aagcaagaac agcttgaaac 

29751 agggctgatg ggacatgagt tatgacgtga 

29821 tcctactaaa atcaaggtac cccgaaaccc 

29891 gcgaatgaag tcgtagcaga cgaccctgtt 

29961 attctactga cgcgggaaaa acctttgccc 

30031 aatcattcaa cgagccgata ctaccgaaat 

30101 aaccttctcg agcaagacat tttgacagaa 

30171 agtatggaag cctgaagaat CtgcCagCaa 

30241 acagtctgcg aagccgctgc tactaagatg 

30311 cagggaatgc ecgacagaaa cccaaaggag 

30381 atcacatcac acggactacg ggttttggct 

30451 gctgtagaag acaatgtcga agaacttttt 

30521 caaacgaacg acaacgatgg acagattgaa 

30591 cctccaggag tcgaattcga cgagcaagac 

30661 atagaatgcc cagcgcaaca aacagcctag 

30731 aaactcaatt attggtaccg acgaatatag 

30801 gcaacctatg cagaaaccgg tgactacttc 

30871 gaaccccaca aggaggaaac caacaacgag 

30941 gagcttgacc cattgactca gccgceaaaa 

31011 cagaactcga agccgtgacc tcggagggaa 

31081 cgtgcgtact ccagacctct cacacggcca 

31151 atggccctaa ccgaaggcgg cacagtacgt

31221 tegcacaagg tgcttctaat acgaaaccae 

31291 aatcgtcaac tacgtgaaaa tcactttgaa 
aaaaggaagc cagggctgca ectttagtat tttcacgaag
ctataaggac ggcaaaaacc tcccgaaacg cgctcggtct 
caaacagtca cacaaggcct agcaagcgga aegCccgeca 
ccgaccctaa ggcccgaaaa gattgggact ttgataagat 
caaacaecaa aacctcgaat acaatgcccc tcgacttgcc 
ggagcgagac aacggggcaa ggccaacccc catgcccgaa 
gtcgaacgtg tcaagcgtgt atcgatttag acggtgaagt 
tcccaacgga atgtgctacc aaactgcatg gtacgaaaac 
ggctgggcag acggagaacc caacgatgca tcagacgaat 
agaaatacag cgacctcgac tttgttaaaa gtcattaggc 
aactgtctaa tttcgagaac cctcgaaaag cagtaaaatg 
aaggaacctt gtcgccttaa tgactcgaaa ttggtttcac 
gccggagggc ggaaaaccca ggaggaaaat aaaeggctta 
cgaaccaact atcaaacagg cgaaggaaat tatctcgaaa 
actgacggcg acggtcaaca ttctgtacct cacgcacgct 
ccaacggctc aattaattct tacaaagaac aagtcgcgac 
tgcgcagacc accatccaaa accttcaaga gcaactcgac 
atcaccccag ctcttcatcc gctgattagc gactccattg 
accttgacaa cactaeggtc gaaagtgacg gcaaagttaa 
tgagtctcgt aaacacttat tcaaagaagt cgaagctccc 
gccgggactg gaaacttagg aaatccaggc cgcgtcggtg 
cttttggtaa gcaacttgct gctgctcaac aaacggcagg 
acaggaggaa ctaactatgc ccaatgtgcg agttaagaaa 
gccgcaattc ctgaccacta cgttgctttg gctgctcaaa 
acaagaaata cattcttgcc ggaacttgcg tgaaaaatgc 
cgaagtagta tctaccggtg aacaattcga cggagttatc 
gaaaaagCaa ccgtgacagt attagttcac ggattcgtca 
ccgcgcccga atctaaaaac gcaatgaccc ttgtcgttaa 
tcatatcaac gcaggggaga tcgctagcta cacccaagca 
ccaactcttt tccctaatgc tcaacaaaca gggacagaca 
cagtaactat ccagccatct aactacgacg cgaaagcaag 
agccactgag atggcattct tccgtgagtc tatgcgactt 
ctattgaacc aaagttcagc tcttgcccaa ccacttatca 
tagacggtgc tgaagcgcaa gcagaacaca tgcgtatgca 
aecaaccaac agcgaggccc aatacactta cgactacaac 
aagaaatgga ctaacccagc tgaaagtgac cctatcgctg 
accgtacagg tgttcgccct actcgaatgg tcttgaaccg 
ctctatcaag aaagctcttg caattggtgt tcaaggttct 
gctgagaaat tcatcgccga aaaaacaggt cttcaaatcg 
ctgacgccga caaacttcct gacgttggca acacccgtca 
actgcttcca cctgacgcag ttggtcacac ttggtacggt 
ggcggaacag acgctcaagt tcaagttctt tcaggcggac 
ctgtcaacat tgcaacagtt gtatcagctg ctacgattcc 
tctcacaact aattaggagg tcgctatatg gctacatcga 
cagcagcgca ttcagggccg gtattttctt gccctgaagc 
tgcgttcgag attaaggegg ctgaagatgg agaaacggta 
gaagaaattg acgaagttga acaaatgcgc gaagagtatg 
tagcaagagc taaeggaact gacattccct caatttctcg 
gtacgaacta ggagagtaaa atggcagccc aaacggacat 
taattctccg tcaccaatga ctgaccaaag tatctcagct 
gtcagttaca tgacttgctt aatgaagacc cggaatgacg 
gtgacgcaga ctactggaaa caaatggcgc aaccccacta 
tgatgaaaag ccgaacgctg gctcgacaac cctaatgaaa 
accacgttaa gaaccaagtt cgtagagcca ttgaaaccgc 
ttgggtcagc gacggatatg gaggaaagaa aaaggataaa 
tgcttagttg ataattcaae tgttcctgac ctcttagcca 
aaaatggagt gaaaattttc attetatatg atgaaggcaa 
caaaaaccca ggaagacggt acagggcagt agaaacccac 
cccaaatcgg aggcgaacga cCaacgtccc agcctgaatt 
ctgtgaacgg tatcgaaaca agtttcaagt cgctgtcaca 
gaagaatacg caaagacgca tgctattcgg acagaccgta 
aagccgctcg ggtaagcgca gaccaaacca cgacagccgt 
agaaccagct catggtcgaa aacacaaaat tcccgaacag 
agagcgttga gaaggttatt agactaggag tgaacatgac 
ggaaaccccc cctacatttc agctctcgcc tgctcctacg 
acagataggc cggatgacta caccgttctt cgatatagcc 
gaagttttgc ttattggaaa gttcaaacct acgcccatcc 
cagaaaggtt cgaaacatta tcaaggacac gggctacgaa 
gacacaacgc tttctagata ccgactagaa aecgaatata 
taaagacatt ccttacggaa tcaagcccgt gcaaarcgag 
gtcggcggag ctaaccttgt cgtagacacg gcagaaacag 
ctgaagatgt gaaacgcaat gacacgcgca ttcttgctat 
cgacctaaca cccaaggaca acacgtttga ccctgaaatc 
caacaaggcg gaactattgc cggatacgac accccaatgc 
tcagaatgaa catctatgtg ccaaaccacg caggcgactc 
taacegcacc ggtaaagctc cagggcttec aaccgggaaa 






31361 gagttceacg ctcctgagcc caacaccaag 

31431 cggactacgt ggcacaactc ccagcggttc 

31501 cgccgacgca gtecgagttg aagcaggcaa 

31571 aaggctttca aaggctggaa agtcgaagga 

31641 accgagacgt caaaetcgta gcacaactcg 

31711 atcacagctg agcagtttaa gcaactcgca 

31781 aacctatcca cgtcaaaatt cgagcagcag 

31851 gcttttaggt aaagtgacag aactgtttgg

31921 tcaattactg accaacagaa gaaagaagcg 

31991 tggccgaact ectccgagta ttcgcagaag 

32061 tatgacagat gagcaactta tgacaaccct 

32131 cgtacagacg aaggaaatgt ctaatgtcat 

32201 gtcgggatgc aaactgatct aggcaaatac 

32271 aggaagacaa gactcctagg catcctggtg 

32341 actattetca gtcgctcctc tttttgtaca 

32411 aaacgacttt ggacatctca aacttcacaa 

32481 accagagtct tcgaagccct ttcaaattgg 

32551 gttaccctcc ctcttatggg atttgcagcc 

32621 cccgcgttca agctattgca ggagcgacag 

32691 tggtgctaaa actgctttta gtgcaaaaga 

32761 caggtaaacg aaatcacgga cgctacgcca 

32831 ccgcgagccc cgaggccatg gctagttcac 

32901 ggctgacgca ettgctcgag cagcagctga 

32971 tacgtegcac ccgttgctca ctctacgggc 

33041 ccgacgccgg tattaagggc tcgcaagccg 

33111 tacgaaagcg atggtcaaat caatgcagga 

33181 ccactaagag aacaaatcgc tcaactgaaa 

33251 accttgttac cttgtatggc caaaacccgt 

33321 attggataag atgaccaacg ccctcgtgaa 

33391 gacaaccttg ctagtaaaat cgagcaaatg 

33461 tccttgagcc tgcacttgct aaaatcgtgg 

33531 acccatcggt caaaagatgg ctgtcatatt

33601 gcaggaatgg tgatgacaac tattgtcaag 

33671 gaacgatggg aaccattgca ggagttatag 

33741 cacaaaatcg gagagattta gaaactttat 

33811 gcgttggaac ggctacctcc acgaccgaaa 

33881 aagagttcgg tcagtctgta gggtctaaag 

33951 ggcaggaggc tcgactggtc agttcattgg 

34021 ggaggagcca cttcaattgc tgttecactt 

34091 cacccgggat tgctattagt ctgttagttt 

34161 agacggaact actcaagtat ccgaaaactt 

34231 taccttccag tcttcgtcga aaaaggaact 

34301 ttcctcaagt agttgaagtg atttcacaag 

34371 tcaattagtc gaagcaggaa ctaagatact 

34441 atcattcaag cagccgttea aattatcact 

34511 ttcaagcagg tcttcaaatt ctgccagccc 

34581 agcagctgct caaactatca tgccgcctgt 

34651 gcgacgcaga ttataatggg cctagtcaac 

34721 ttcaaattct aatggcttta atcgagggac 

34791 aatcattact tcactattag aagcaatctt 

34861 cttttaecac ttcctcaagg gttgccaaat 

34931 tggcacttct taaagcagtt atcgacttcg

35001 actgaetcaa ggtattgctt cactectegg 

35071 gttagcaaga ttgctagctt cgcgggacag 

35141 gtggtattgg gtcaatgatt ggttcagctg 

35211 ggttactgga ttcgctggac aaatggtaag 

35281 agctccatgg taagttctgc ggtaagtgcg 

35351 gattcttagg cactcactct cctccacgtg 

35421 aaatggtatt ggtaacatga ttcgaaccac 

35491 gctctcagcg acgtgaagat ggacattcaa 

35561 agatggctga ccaacctcct gaaactcttc 

35631 gcctcgagtg gacttgttca atacaggaag 

35701 caaggcgagc aaaccgttgt caacattgga 

35771 cgagaggatt gtataataga agcaaagaaa

35841 aaatagatgg ctagcagaca gacgctattg

35911 tagaatatgt aggactcact ttcgcaggat 

35981 agcattagac tctccgtcca atgctacgtc 

36051 accgaaaagc aagccaacca aaaatacagg 

36121 ttccgacact tgaagaccct ggatactatc 

36191 tgnagacgtc caagccttta aagatacetc 

36261 gagtacagcg actcaacegt tegaaaggtt 

36331 acccaggaag acctacccga caatttagag 

36401 aaccggcgaa aaaagttcag gacagcttgt 

36471 atcattattc taaatcttgg aacetttgaa

36541 ttagatacat taaacgaggc gcattcttca 

36611 agccgatgac gcageagcct ggacctctac 
gcacgcgaag caaccaaagc aggtttgcca gttaagtcaa
ttcgtcgcgt gacactcgac ctgaacggtg gaacaggaac 
gaagatctct ccaaaaccag ttgaccccac cctaacaggc 
gaaccaacta tttgggaccc cgacaaccac atgacgcctg 
cacagaaatt cagaaagaag ggtctgccat gactaatatt 
ccccaaatca ccgcactccc aggatttcca aaaggtagtg 
gegtcatgaa cctaatcgct aacgggaaaa tccctaacac 
agaaacttcg acagtcacta aagacaatgc tagtctagca 
cccgaccgat cgaacaaaac cgacaccggc attcaagaca 
cttcaatgge agagcccact tacgetgaag tcggcgagta 
cagtgcaatg tacggtgaag tgactcaagc tgaaaccttt 
agcagtcgct actgaatttc atattagacc tagcgaggtg 
tgcttcgacg cagcagccgt tgcttaeatt agatatttgc 
acgaaaagaa aaacccagga ttgcaaatgc ttatggagtg 
tagaaaggaa ateacatgga ttttgggcca attgcagcaa 
gtcaattaaa tcttgctcaa agtcaagcgc aacggctcgc 
ttctgcttta acaggactag ggaaaggacc tacgactgcg 
gcctctatta aagtagggaa tgaatcccaa gctcaaatgt 
cggaagagct tggtagaatg aagactcaag caaccgacct 
ggcggctcaa ggtatggaaa atctagcttc agccggtttc 
ggggtacttg acctggccgc cgtatctgga ggagacgtgg 
ttegagcetc tggatcagag gcaaaccagg cgggtcacgt 
cacgaacgca gaaactagcg acatggcaga ggcgacgaaa 
ctgagccttg aagaaacggc tgcgtccatt gggattatgg 
gaaccacgcc tagaggcgct ctctcgcgca tcgccaaacc 
attaggagtt tcgttctacg acgcgaacgg aaacatgatt 
acagctactg caggactaac acaagaggaa cgaaatcgec 
tgccaggtat gcttgcacca ttagacgcag gtcctgagaa 
ctcggacgga gccgccaagg aaatggcaga aactatgcag 
ggaggagcte tcgagtctgt tgctattatt gttcaacaaa 
gagcaatcac aaaagctccc gaagcattcg taaatatgtc 
cgcaggaatg gttgcagccc ttggaccact gcttctaact 
ttaagaattg ctattcagtt tttaggtcca gcatttatgg 
caatattcta tgctccggtc gccgtgtcca tgatagccta 
caacagcctt gcgcccgcta ttaaagctgg gtctggagga 
gagccaggag aatggttaca gaaggcaggc gagaaggcga 
tgtcaaaact gctcgaacag tttggaataa gtatcggtca 
aaatgttctc gaaaggctag gaggcgcact tggaaaagta 
gtaacaaaat tcggtctcgc acttccaggg attacaggac 
catttttgac agcttgggct agaacaggtg agttcaacgc 
gacaaacaca attcagtcga cggctgattt catctcccaa 
caaattttag ttaagattat tgaaggaatc gcatctgctg 
tcattgaaaa cattgtgacg acaatttcga cagttatgcc 
cgaagcgctt ataaacggtc ttgttcaatc tcttcctact 
gctttattca acggtettgt tcaggcactt cctacgctta 
tcataaacgg accagttcaa gcgctcccgg caattattca 
tcaagcacta attgaaaacc tgcctatgat aatcgaagca 
gcactgattg aaaatatagg acctatctta gaagcaggga 
ttatccaagt gcttcctgaa ctaaetacag cagcgactca 
gtcgaacctt cctcaacctc tagaagccgg agttaaattg 
atgcttcccc aactaattge aggggctttg caaatcatga 
tccctaaact tcttcaagca ggtgttcaac ttcttaaggc 
ctcactttta tcgacagccg gaaacatgct cccaccatta 
atggttccag gaggtgcgaa cctgattcga aacctcatta 
tctctaaaat tggcagcatg ggaacttcaa ttgtttctaa 
cgcaggggtc aaccttgctc gaggatttat caatggtatc 
gcggccaata tggctagcag tgcattaaat gccgttaagg 
tcacggagca gacgggtatc tatacgggtc aagggttcgt 
acgcgacaag gctaaagaaa tggctgaaac tgttactgaa 
gaaaacggag ttatagaaaa ggttaaatca gtttacgaaa 
cagctccega Ctccgaagat gttcgcaaag cagccggttc 
tgacaaccct aaccaaectc agtcacaacc caaaaacaat 
acaatcgtag tttcgaaacaa tgacgacgct gacaaactgt 
ctctatcagg gtttggtaac attgtaacac cgtaaaggag 
gccgacggaa ttgaccttgt cgacaaaggc gcaaccgcgc 
ttaaggactc aggatttaaa aaccctgaag gcacagacgg 
cgcccttact ggaagcgtga ccttaatgtt ccacggagaa 
cagctcaaac aatttatccg cccgaagtca ttttggagaa 
gaacgggaaa atttttagga gaaaccgagc aaggaaaASt
cctcgtagtc aaattaggga ttcagttcaa agacgcttac 
tataagtttc aacccgcttt gggaggcgat agctcaccca 
tagaaacaag aaccacttcc caaaccaaag gatactttcg 
tgagcccggc actaattcag tattgatgga aagtggcccg 
ctcaetaaaa ttagcagcgc aaatcaagcg actaacttat 
agattcccaa cggaaaetca acaaccacca tcgaacaccg 
tctccccgct caagccgaac tgttnetaaa tccgccttac 
36681 tattagaaag ggaacacacg attgacaata

36751 atatgaccaa aacctcaacc caaccggagc

36821 gtgactcgag cccgaggaaa agaaactttc 

36891 taaaggttga aaacatcatc cagtatggag 

36961 tgccaaaggg cctaccaagt ttacctgcca 

37031 ttgaaacacg ctgcttcttc tgtaggcgct 

37101 gactagcetg tcctcctgac ggcgctaaca 

37171 ttggcatctt cgatatcttg caaagcaata 

37241 caagaggtta gaactgttca aaccgctgca 

37311 ttgtagtcga agagaatttg aaatatgtca 

37381 gttgacaggt aaaaaggaag aaggcagtca 

37451 tatctcaetg acgntccgtg gcctactaca 

37521 acgaacactt tagaactaaa gaaaatttga 

37591 actaattgga tatgaggctc cagcggtcct 

37661 gtcgacgacc attatgatgt tatcgagcgg 

37731 caaaccccac eatcattttc caagaceccc 

37801 agtcctttca ggggaaactg taaatgagtc 

37871 aattctaatg cagaatctgg gaaatacact 

37941 ccgacgactt cacatggatt cgactagaag 

38011 tgatggagtc gacggtgtac ctggaaagag 

38081 tccgtttccg gaacgcaaga gcccgaaaat 

38151 tctcgtggac caaaacatct tggagacata 

38221 agggcaagac ggaaattccg gaaaagacgg 

38291 gtcatgtatg caagttcgcc atctgctacc 

38361 tcccaggtgg tcagtatcta tggactcgaa 

38431 ttcagtttca agaatgggcg agcagggtcc 

38501 ggaacagggt tgaagtcaac ttcagtttct 

38571 gggcttcaca agctcctcct ttaaccaaag 

38641 ttcaactacc gaaacgggct accaaaaaac 

38711 gctggtaagg atggggtagg aatcaagtct 

38781 cgcctactcc aaaccggact cctgctattc 

38851 ttggaactat actgacgaca ctagcgaaac 

38921 ggagttcaag gtctecaagg tcctcaaggg 

38991 cgcaatatac tcacctcgct ttctccaata 

39061 agcatacgtc ggtcagtatc aagatttcaa 

39131 aaatggaagg ggaatgacgg agctcaaggg 

39201 tccatatagc ttacgcetca agtgcagacg 

39271 tatgggttat tactccgatt atgagcaagc 

39341 cttgccaatg ttcaagtggg aggccgaaac 

39411 gctattctag ttacaatcta acggacggac 

39481 acgtcaacgg ttcaaaggtg ctaactcttt 

39551 aaactgacct ectcettagg aggagaeacg 

39621 gtatcagtct ccgggccaag gcctctagga 

39691 cgtatttacc gcaacctcaa ccgatc&atg 

39761 aattgtaccg ctgaagcaat tttccacgta 

39831 tcgaacttgg taatatctct actcctttta 

39901 agccgatcaa aagccaacca accaacagrt 

39971 ctgaaagcta aggccacaae ggagcagtta 

4 0 041 atgaagaage tatcaaaaaa ccggaagccg 

40111 agaacttggc gggctacggg aactgaagaa 

40181 attatcggta agaacgacgg cagccccacc 

40251 ggaatgaagc tatgtaccct acgcaagggc 

40321 agtcggccga tttagaacgg aacaatactc 

40391 ggagaacaac atgacaaaat ttatcaactc 

40461 agtcaggacg taacgaacaa ctcctcgcga 

40531 gaacgtggac ttatggaaat attagtaacc 

40601 cccagactac gacacgtccg gcgaagaggt 

40671 gacgggacaa agacaatgcc cgtctgggct 

40741 tccctactaa ctacacccca gacagtactc 

40811 tctaggatct ttacatacgg tcacccttaa 

4 0881 gttttcggca gcgactggat agatttaggt 

40951 acttagcaag gtactcacct aaatcaagtt 

41021 tacgcaaatt ggtagcgacg tctacccaaa 

41091 ttttcgggca tttctttagt agacacgact 

41161 aaatcatgtc gaacactcaa gecaacttea 

41231 tcacgctgag ctcgeaggta aaaaccaagc 

41301 aatggctccg ctaccgtaag agcatgggcc 

41371 ctatcaatgt tatagaatac tatggaccgt 

41441 aaccatccaa gctcttcgaa acgctaaggt 

41511 caaactaccc cceccgtggc gccgttgaac 

41581 cgttcactac tacttcccta atgactaacc 

41651 ttacatagtt aaggctaaaa tccaagacag 

41721 ecagtagttc ttaactatga caaggacggc 

41791 ggtcaaccga tgcagcaggt gatacatatg 

41861 taatggagca ttgaacaggg gtcaatataa 

41931 agtaacaaac acgaggacaa ccccacggga 



.o5 

acttacccat gagtccaacc cctggcgaaa tcgcccaagc 
aagcgacgaa atcttcagca agcactacga agacgaaatt 
acttttgaaa gcatcgaaac cccatctatc tatcaacact 
gaagatggtt tcgaaccaaa tatgctcagg acgtagaaga 
cgcatcatgg catgaactag cagaaggctt gcctaggaag 
gccgcgctag atattatcaa agacgcaggc gaacgggtcc 
aacaagctcg aagcacaaca gccgcagaaa aeecaatgct 
caatttagaa ttgacatttg gttatgaaga aactatcaag 
tttcttcagc cctacgtcga gtccaaagta gactttcctc 
ctaggcagga agattctcga aacctgtgta cggctcacaa 
agagcctcca acgtttgctt ccatcaacaa cggaagtgaa 
cgccacatga agccccgaca taetgctaaa tctaaaagcg 
tgagcgctgc gcgtgcctat cctgacatct acagtcgccc 
tcataacaag gttccegact egcatcatac tcaactaatt 
cgaaagatat ctgctcgaaa aattgactac gacgaccttt 
gaaaagactt gatggacttg ctaaatgagg acggcgaagg 
ccaagttgtt actagatacg cagatgacat tttagggact 
ggtgtcctta acaccaacaa gaaaccgagc gaattagtec 
gccccaaagg tgacgcaggt ttaccgggag ctcctgggcg 
cggagcaggg atagcagata cagctatcac ttatgcegta 
ggatggagcg aacaagtccc egaactcata aaaggtcgat 
ctgacggccc acatgaaact ggacactccg ctgcctatac 
aaccgcaggt aaggacggag taggtatagc cgcaaccgaa 
gaagccccag ctggtggacg gtctacgcaa gttcctaccg 
caagatggcg ccacactgac caaactgatg aaactggaca 
taaaggtgac gcaggtcgtg acggtattgc aggaaagaac 
taeggaatta gtcccactga cectgcgatt cctggagtat 
gtcaatatct ttggactcga actaCttgga cctataccga 
ctacaetcca aaagacggga atgacggtaa aaatggaatt 
acgaccacta cccacgcagg ctcaacctca ggaacagttg 
caaatgttca accgggattc ttctcgtgga cgaaaactgc 
aggtcactca gtttccaaga caggtgaaac aggccccaga 
cttcaaggaa ctcctggacc tgcaggagct gacggacgtt 
gtccaaacgg cgagggattt agtcatactg acagcggacg 
tcccgtccat tcaaaagacc ctgcagccta tacatggacg 
atacccggga agccaggcgc agacggtaag actaattatt 
gatcacgtga gttcagtttg gaagataata atcaacaata 
agacagcagg gatcgaacta agtatcgatg gcttgaccgc 
gagttcctta attctttatt tgaatttggt ttaaaacctq 
aagatcaaac gcaaggacag atatctgcta ctattgacga 
acgacccgac tcaacatgga acggtaaacc gcagaaccaa 
cgattaggta ctccaaccga gtggtctaat ttagaaggtc 
acggagtgag ctcagctgca cggccgggct aecgtagtaa 
gaagttctac gattttaaat tctttgacaa agttaatcca 
ttcactcaaa gtcgttcagt gtggctcaat catatcaaaa 
gtgaagcaga ggaagaccrt aaatatcgaa ttgactcaaa 
gacggcactc acggaaaagg ctcaactaca egacgcagaa 
agtaacttag aaaaggctta tgaaggtaga atgaaagcta 
acccaatcct agcggcaagt cgaattgaag ctactatcca 
gttegtcgac agttacacga gctctcccaa tgaaggtcta 
accaaggtat caagtgaccg aattcctaeg ttctccgcag 
tcattcacat cgataacggg atcttcaccc aatccactca 
gtteaatcca gacacgaacg tgattcggta tgtaggataa 
atacggccct cttcacttga accctcacgt cgaacaagtt 
gttagttggc gagctactgt cgaccgcgat ggagcttatc 
ctcccgcatg gctaaacggt tcaagcgttc acagcagtca 
aacgctcgca agtggagaag tgactgttcc tcacaatagt 
tcgttcgacc ccaataaegg cgctcacgga aatatcacta 
caaggtccac acagatttct agccttgagg gaaaccgaaa 
ccgaaaagcg aaccctttta cgcaccaagt ttggtaccga 
aagaaccata ccactagcgt atcctttacg ccgtcactgg 
ccggaacaac ggacatctgt acccgaaccc acaacggaac 
cggacggagg cccaacatcc ccgacccagt acgecctacc 
tcagcggttc gacagatttt aacagggaac aacttcctcc 
acaacgcttc cggcgcttac ggatccacta tccaagcatt 
tatcaacgaa aacggcggca aattgggtat gatgaacttt 
acagacacgc gaggaaaaca atcgaacgtc caagacgtat 
ctatcaatct ctccgttcaa cgtactcgtc aaaatcctgc , 
cgcacctata acggcaggag gtcaacagaa aaacatcafcg " 
accaccaatt ccacagaaga cagaggttcg gcgteaggga 
cgtccgcgaa ctcagccggt aactacgggc cggacaagtc 
gctcactccg actgaarcta gtgctacggt agctaccgaa 
cgacctggag ttggtaaggt tgtagaacaa gggaaggcag 
ctggaggccg acaagttcaa cagtttcagc ccaccgataa 
cgatgttcgg aataagcgtg aaacagagtt tacacggcga 
actcgaggcg aatggggact atctcaaaac ttccggttag 
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42001 atagctggaa aatggctcaa tccttcatta caatgccagg aagaacgttc atcaggacag cgaacgatgg 

4 2071 aaacagctgg agacctaaca agtggaaaga ggttctattt aagcaagact tcgaacagaa taattggcag 

42141 aaacttgttc tccaaagtgg gtggaaccat cactcaacct atggcgacgc attctactcg aaaactcttg 

42211 acggcatagt atacttgaga ggaaatgcgc ataaaggact tatcgacaaa gaggctacta tcgcagtact

42281 tcctgaagga tttagaccga aagtttcaac gcatctccag gccctcaaca actcatatgg aaacgccatt 

42351 ctatgtatat acactgacgg aagacttgtg gtgaaatcga atgtagataa tccttggcta aatttagaca 

42421 atgtctcatt ccgtatttaa tttgagctga aatcatgtta taatattttt tagaaaggag gtgagaacta 

42491 tgttgaaccc cacaaaatcg cgccaaattg tggcagagtt cactattgga caaggagctg aaaagaaact 

42561 tgtcaaaaca acgatcgtga acatcgacgc aaacgcagta ccaaccgtct ctgaaactct tcatgaccca 

42631 gacttgtatg ctgcgaaccg tcgagaactt cgagctgacg agcaaaaact tcgcgaaacc cgttacgcaa 

42701 tcgaagatga aattctagct gaacagtcaa agactgaaac agctctaaca gctgaataag gaggcgtcaa 

42771 tctatgccaa tgtggccaaa cgacacagca gtctcgacga cgattattac agcgtgcagc ggagtgctta 

42841 ctgtcctact aaataagtta ctcgaatgga aatcgaataa agccaagagc gtcttagagg atatctctac 

42911 aactcttagc actcttaaac agcaggtcga cgggattgac caaacgacag tagcaaccaa ccaccaaaat 

42981 gacgtcattc aagacggaac cagaaaaact caacgttacc gtcttcatca cgacttaaaa agggaagtga 

43051 taacaggcta tacaactctc gaccatttta gagagctctc tattttattc gaaagttaca agaaccttgg 

43121 cggaaatggt gaagttgaag ccttgtatga aaaatacaag aaattaccaa ttagggagga agatttagat 

43191 gaaactatct aacgaacaat atgacgtagc aaagaacgtg gtaaccgtag tcgttccagc agcgattgca 

43261 ctaattacag gtcttggagc gttgtatcaa tttgacacta ctgctatcac aggaaccatt gcacttcttg 

43331 caacttttgc aggtactgtt ctaggagttt ctagccgaaa ctaccaaaag gaacaagaag ctcaaaacaa 

43401 tgaggtggaa taatgggagt cgatattgaa aaaggcgttg cgtggatgca ggcccgaaag ggtcgagtat

43471 cttatagcat ggactttcga gacggtcctg atagctatga ctgctcaagt tctatgtacc atgctctccg 

43541 ctcagccgga gcttcaagtg ctggatgggc agtcaatact gagtacatgc acgcatggct tattgaaaae 

43611 ggttatgaac taattagtga aaatgctccg tgggatgcta aacgaggcga catcttcatc tggggacgca 

43681 aaggtgctag cgcaggcgct ggaggtcata cagggatgtt cattgacagt gataacatca ttcactgcaa 

43751 ctacgcctac gacggaattt ccgtcaacga ccacgatgag cgttggtact atgcaggtca accttactac 

43821 tacgtctatc gcttgactaa cgcaaatgct caaccggctg agaagaaact tggctggcag aaagatgcta 

43891 ccggtttctg gtacgctcga gcaaacggaa cttatccaaa agatgagttc gagtatatcg aagaaaacaa

43961 gtcttggttc taccttgacg accaaggcta catgctcgct gagaaatggt tgaaacatac tgacggaaat 

44031 tggtattggt tcgaccgtga cggatacatg gctacgtcat ggaaacggat tggcgagcca tggtactact 

44101 tcaatcgcga tggttcaatg gtaaccggtt ggattaagta ttacgataat tggtattatt gtgatgctac 

44171 caacggcgac atgaaatcga atgcgtttat ccgttataac gacggctggt atctactatt accggacgga 

44241 cgtctggcag ataaacctca attcaccgta gagccggacg ggctcattac tgctaaagtt taaaatatag 

44311 agaggaggaa gctettttct taatattgtt tctcttaatc ccgcaaggtt tcgaccctgc ggggttttgt 

44381 gtcgtatatt accctattta cttattcgaa gatttcaatt ataattaaat agtcaacatg attcatgatt 

44451 gttgatatga ccctttccgc cctacataat ttgtggggcg tttatttttt ataaaaattt tttacaaaat

44521 gcttgacaac attcactcat tatcgtataa tacaattata aaaataaata aagccgaaag gcgaggagga

44591 cattatgtca aaaattaaat tcgaaaacct taaaaaaggc gatgttgtgc tacgagctaa atctcaaacg 

44661 aagtttaaaa tcgtttcaat tttagcagac gaaaagaaag cagaccttga atcattagaa gacggaggtg 

44731 aacttcacct ttcagcttca actctcgaac gttggtacac aatggaagat gaaaccgaac ctaaaaaaga 

44801 agaagctgct aaacctgcta aaaaggctgc tcctgcagtt gctcgacctg ctcgaaaagg tagagtcgtt 

44871 cccaaaccta aaaaagaagt ccttgaggaa gaaattcctg aagttaagga acagccggaa gaagttggtt 

44941 cagttagtga gaaatctact gttcgaaaac ctgctcctaa aaaagaaagc gtgatggcga ttactaaggc

45011 tcttgaaagt cgaattgttg aagcctttcc tgcgtctact cgaatcgtca ctcagtctta catcgcctat 

45081 cgctctaaga agaacttcgt tactatcgaa gaaactcgaa aaggtgtttc tattggagtt cgcgcaaaag 

45151 ggtcgacaga agaccaaaag aaacttcttg catctattgc tcctgcatct tacgaatggg cgattgacgg 

45221 aatttttaaa ctcgtcaagg aagaagatat tgacaccgca atggaattga ttgaagcttc tcacctttct

45291 tcgctatgat tgaaatcgtt atagcacgtt cgaaagctag gcgaggtcga accctattta ttgaaacatg

45361 ggcaagcact gatgaagatg cagttaaaat ggcagaaaag atttccagct tgcccaatgt agtcgagacg

45431 tcttctaata acttcgaact accttataag tatttcaata atgttataga cgctctagat gaatgggagc 

45501 ttcacatctt cggcgaactt gataaagatg ttcaagacta cattgactct cgaaaccgaa tagcttcttc 

45571 aagcaatgag cagttttcgt tcaagactac tccattcgcg caccaggttg aatgtttcga atacgcacaa 

45641 gagcatccat gtttcctctt aggcgatgag caaggtttag ggaaaactaa acaggcaatt gatattgcag

45711 ttagcaggaa ggcaagtttc aaacattgtt taatcgtatg ttgcatatca gggctcaaat ggaattgggc 

45781 aaaagaagta ggtattcatt caaatgagtc agctcatatt ttaggaagtc gagtcactaa agatgggaaa 

45851 ttagtgattg acggagtttc taaacgggca gaagactcgc ttggtggcca cgacgaactc ttccttatca 

45921 ctaacattga aactcttcgc gatgctgtgt tcattaaata cttaaatgaa ctgacaaaaa gcggagaaat 

45991 tggaatggtt attattgacg agattcacaa gtgtaagaac ccttcaagta agcaaggggc ttcaattcaa

46061 aagctccaaa gttattacaa gatgggactt acaggaactc ctctaatgaa taacccaatc gatgtattca 
46131 atgttatgaa gtggctaggg gcggaacatc atacactgac teagttcaaa gagcgacact gtatcgtcga

46201 ccagttcaat caaatcactg gatatcgaaa tctagctgaa cttcgcgagc ttgtcaacga ctacatgctt 

46271 agaagaacga aggaagaagt tttagacctg cctgaaaaga ttcgagtcac agagtatgtc gacatgaact 

46341 cgaaacagtc aaaaatctat aaggaagttt tgactaaact tgttcaagaa atagataaag tcaagctcat 

46411 gcctaaccct ctagccgaaa cgattcgact tcgacaagcg actggaaatc cttcgacttt aactactcaa 

46481 gatgtcaagt cttgcaagtt cgaaagatgt atcgaaattg tcgaggaatg tatccagcaa ggaaagtcct 

46551 gcgtgatatt tagcaattgg gaaaaggtta ttgaacctct tgctaagata ctttcgaaga cagtcaaacg 

46621 caacctggta acaggagaaa ccgcagataa gttcaacgaa attgaagaat ttatgaatca cagaaaggct 

46691 tctgttattt taggaactat aggtgcgcta ggaacaggat ttactttgac gaaagcggat acggttattt 

46761 tcttagatag tccgtggaca cgcgcagaaa aggaccaagc cgaagatagg tgccatagaa teggegcaaa

46831 aagttctgtc actatetaca cgcttgtcgc caaaggtact gttgacgaac gtatagaaga ccttafctgaa 

46901 cggaaaggag aatcagcaga ttatatcgta gatggtaagc ctatgaaatc taaaattggt aaccttttcg 

46971 atatcctgct taaatagaat gaaaactatc tccatattaa ggaaagacac taaaaggaag ccggacagga 

47041 acggaagaaa aactgcactc gaactagctc aagagattga tatgtcacct agtgagttag cagagctcct 

47111 tcaaattcct gaaaggacgg caaceagaat tttaaaactc gacaaactgc tcaacaaaga gcaatgctca 

47181 ataatagaaa ggtatataaa tgaaattcac tgaaggaaaa aattggtata aagttggaga gatatgtcaa

47251 atgttgaacc gctctctatc tacgattaat gtttggtatg aagcaaaaga cttcgctgaa gaaaacaaca

47321 ttcacttccc gtttgttctt cccgaaccca gaacagacct tgaccatcgt ggttcccgac tctgggatga

47391 cgaaggcgtg aacaaactca aacgacctag ggacaaccca atgcgcggtg acttggcatt ctacactcga

47461 actctcgtag ggaaaactga aagggaagca attcaagaag acgccaaagc attcaaacgc gaacatggat

47531 tggagaatta aatgaaattt gaagacgaaa aacagttcat cgccgcaacc gaagaagccg gcgaattaaa

47601 tgctaccaaa ggcgacatgg agaaacaagt caaaagtctt cgtgatgctc taaaagagca catgaaagaa

47671 aatgacaccg aatctgctca aggcaagcac ttttctgcta ccttccacac gacagagcgc tcaactatgg

47741 acgaagaacg ctcgaaagaa attatcgaaa aatcagttga cgaagccgag acggaagaaa tgtgtgaaaa

47811 actttcaggg cttatcgaat acaagcctgt catcaatacg aaacttctcg aggatatgat ttatcacggc

47881 gagattgacc aagaagcaat tcttccagca gttgtcattt ctgttaqaga aggcattcgt cttggaaagg 

47951 ctaaaattta gcgatatttt tggttctgcg acgtttttag ggttagcaga atccaatcac accacttgcg

48021 caggcaaccg ctgtctgegt taatcttaga aggttaatat tacaccataa ggaggagata agtggcaagg

48091 caaagaatag gcaattcagg aaagcctaaa aacgaaattg aactaacatt caaagacaag cctaaaactc

48161 gttctacctt attcaagaag gacgtggcaa caggtctttc aaaagtcgag qatgattatt ttcaaatagt 

48231 tgaagcactt aacggaaaac aatccgaacc taatatgaag caggtgtcac ctttctttat agetcagtat 

48301 gaatttattt tcaatatcaa gtgcatcgat tataactggc ccaactcttc gagcactatg aaaaatgrcc 

48371 gaacttattt aaacattgag tcgaacattg aactctgtcg atttttagct gaaagttttg ttaaatatga

48441 aaatgctcga aaaagattga acccaagcga aaggttcata acggtctcga ctttcaaaag agcctggatt 

48511 ttggacgaac tcgaaggaaa aacgggttca aaattcgaag gattttatta gtttagtaga ctatctttag 

48581 attttttaaa atgtggttta caaaatgacc tcaataggcg tataatttat caaccttgat tctttcgggc 

48651 cggtatatat acaccaacaa tcgagaaata ataaattata gtatcgaaaa tataaaaagg agaaaagttg

48721 gaaaatttag ctgatagaat atggaagaaa aagttaaatg accttttcga gagaagtggg ctacctcaaa

48791 agtatttcga acctcaagtg ttagtcgaac gaaaagccga caaggaatgt tgggaatggc tagaagctgt

48861 tcgagcaaat acagtcgaag aagttcgaaa cggtcttagc attgttattg cttcgaatac tgtcgggaat 

48931 gggaaaacta gctgggcggt tcgacttttg eaacgctatt tagcagaaac tgcacttgac ggaagaattg 

49001 ttgagaaagg aatgtttgta gtgteagctc aactattgac tgagttcggc gactataatt attttcaaac 

49071 cacgcaagaa tttctcgaac gtttcgagcg ccttaagact tgtgagctat tagtcataga cgaaataggt

49141 ggaggctcct taaccaaggc ctcttatcct tatctgtatg acttggttaa ttatagggcc gacaataact 

49211 tgtcgactat ttatacgact aattataceg acgatgaaat tattgacctt ttaggccaaa ggctttatag 

49281 tcgtatatae gatacttcag tggttctaga ttttcaggca agcaatgtaa gaggattgga ggtaagcgaa 

49351 attgaatcac agatatagta acatcacaac tatttttctt tggcagattg tctttctctg tatttgctgc

49421 gcggtgtcct attgtgcagg agtgcataac gagcgagagt ctcaagataa ggtgattcaa agttataagc 

49491 agaaagaaaa gccagccgtc tacttgacag tcgatagttc aggagcttgg ctaggaagcg ctccgggagc

49561 caaggaaagt cctctctaca atgaaaaggg acagcatgta ggaaaattga aagaggtggg agagtgatac

49631 agctccaagc cttaaataaa gttctcgaag aaaagagctt atccatttta gaaaataatg gaattgacca 

49701 agaatacccc acggattatt tagacgagta tcaatttatt caagaacact tttcgagata tggaagagtt 

49771 ccggacgacg aaactattct egaccatttt cctggattcg aatttttcga aattggcgaa actgatgaat 

49841 accttatcga caagctaaaa gaggagcatc tatataattc acttgttcca attttaacgg aagcggctga 

49911 ggacattcaa gtagatagta acattgcgat tgcgaatata atcccaaaac tagaagaact tttcaatcgc 

49981 tctaaattcg taggcggact agacattgct cgaaatgcta aacttcgact agactgggcg aaCactatta 

50051 gaaaccatga cggtgaaaga ettggaatat cgacagggtt tgaactattg gacgacgtgc ttggaggctt

50121 acttcctggt gaggatttga ttgtcataat ggctcgacct ggacaaggta agecgtggac tattgataaa 

50191 atgcttgcaa ctgcttggaa gaacgggcat gatgtccttc tatatagcgg ggaaacgagt gaaatgcaag 

50261 ttggtgctcg tatagatact attctttcga atgttagcat caattcaatc accaaaggga tttggaacga 

50331 ccatcagttc gaaaaatatg aggaceatat tcaagcaatg actgaggctg aaaattccct tgtggtagtc

50401 acgcccttta tgattggagg aaagaacccc acccctgcaa ttttagatag cacgatacct aaatatagac 

50471 catctgtggt ggggattgac cagctttcac ccatgagcga gtcttatcca agcagggagc agaagcgaat 

50541 ccagtacgcc aacatcacca tggacctata taagattcct gctaaacatg gaattcctac tgtgcttaat 

50611 gcccaagcag ggcgctcggc taaaactgaa ggcgctgaaa gtatggaact agaacacaca gcagaaagtg 

50681 atggagtagg tcaaaatgct agcagagtta tcgctatgaa gcgtgacgaa aaacccggca cacttgaacc 

50751 atctgtcgtt aaaaaccgac atggcgaaga ccgaaaaatc atcgaatata tgtgggacgt tgaaactgga 

50821 acccacactc ttataggatt caaagaggaa ggcgaagaag gaactgaaaa aggcgaaagc tctccattga 

50891 aagcaaaagc ctctaggtcg actgctcgtc ttcgaagtaa ggttacaagg gaaggagttg aagcattttg 

50961 atgaaagtaa atggtcttca aattgaagcg actcctgaac aaataattga aaaactttcg agacaacttg

51031 aagacgaagg aacattcact tttagacgaa ctaagtcgct tggaagcaac tatcaattcc catgcccgtt 

51101 tcatgcagga gggactgaaa agcacccctc ttgtggcatg agtagaaatc cttcttattc aggaagtaag 

51171 gtgacggaag ctggaacggt tcactgtttc acttgcggct acacttcagg accaactgaa ctcgtctcga 

51241 acgtactagg tcgaaacgat ggagggttcc atggaaacca gtggctgaaa aggaactctg gaacatccag

51311 cgaagtagtt aggcaaggcg tcagccctga agcgtttcga agaaatggga gaactgaaaa agtcgagcat 

51381 aaaatcaccc ctgaagagga acttgataaa taccggttta ttcatcctta tatgtatgaa cggaaattga 

51451 cggacgagct catcgagatg tttgatgtag gttatgacaa actgcatgat tgcatcacct ttccagtacg 

51521 gaacctcaag ggcgaaacag tattcttcaa ccgtcgaagt gttcgttcta agtttcacca gtacggtgaa 

51591 gatgacccta aaacggaact tctttatggc caatatgagc ttgtagcatt tcgagactat tttgaaaaac 

51661 ctactagtca agtattcgtg actgagtctg ttatcaactg cttgactctt tggtcaatga agattccagc 

51731 agtcgctctt atgggagtag gtggaggaaa tcaaatcaat ttactaaaac gacttcctta tagaaatatt 

51801 gtcctagcac ttgaccctga taacgctggg cagacagcgc aggaaaaacc ctaccgacag ttaaagcgaa 

51871 gcaaggtcgt tagattttcg aactacccta aagagttcta tgataataag cgggatacaa acgaccatcc 

51941 ggaattatta aattttaacg atttagtctt gcagaaattc atttattatc gcataataaa gttagaaaat 

52011 tttaaaaaga ggtcatatca atatgaaaga agcgaataga ctagcttcta gccatgtagg attcgaatgc 

52081 tggactgacg aagaatgtat caggaacttt gaactagacc ctgatatgtc aattgcgtct gcttatcatc

52151 gttattttgg gatgctttat tcctatgcaa aaaggtttaa atgcttatct cgacatgaca ttgaaagcat

52221 tgcattcgag actatttcaa aatgtttggc aacgctcaaa tcaaaccaag gggccaagtt ttcaacttac 

52291 cttacaagac tcttcaagaa tagaacagtc ttagaatata ggtacctaaa tgcaccttcc acgaaccgaa 

52361 attggtatgt agaagtgacg ttcgatagcg tttcgacaaa tgaagaaggc gacgatttta gtatcctatc 

52431 gacagttggc tattgtgaag actacggaaa aattgaaatt gaagcaagtc ttgacttcat gacgctttct 

52501 aatacagagt atgcttatat ctcgtctgtc attcaaaacg gtccttcagc aagcgacgca gaaattgcgc 

52571 gtgaaattgg agtaagcagg tctgctatca gtcagtctaa gaagtcacta aaaaacaaat caaaagattt

52641 tatataactg gtttacaaat cacgcgaact ccgtgtacat catatacgaa aggacaaact ctgaaacctc

52711 aaaaacttca aaaatctetc aaccattaaa aacccataaa ggagaaccga cacgggaaaa gtatcaatcc

52781 aaaaaccagg aacacttagc tcagggtcca ataacgagtt ttecacaetc gctgaccacg gtgacagcgc

52851 aattgtcact ctactgtacg acgacccgga aggcgaagac atggactatc ccgtagccca cgaagcagac

52921 gttgacggtc gtcgacgcta tatcaactgc aacgctactg gcgaagacgg ggaaacagcc caccctgata

52991 accgtccatt atgccaaaac ggattccctc gtattgaaaa actatttctt caacctraca accatgatac

53061 gggaaaagtt gaaacatggg accgaggccg tccctacgtt caaaagattg ccacattcat caataaatat

53131 ggaagccctg tgactcagcc ttttgaaatt attcgttcag gagctaaagg tgaccaacga actacttatg

53201 aactccttcc agagcgtccg gaagacagcg ctactcttga agacttccca gaaaagagcg aacttcttgg

53271 aactctaatt ttagacctcg acgaagacca aacgtttgac gtggtegacg gcaagttcac tcttcaagaa

53341 gagcgttctt caagtcgttc aaactcacgt agaggagcat ctcctgcgcc tagacgaggt tccggtcgag

53411 aatcttcaca aggtcgaaca gctgaaagaa ctccttcagt tagccgaaga acccctccaa cacgaggtcg

53481 aggattctaa catgagggcg cgagccccct ttatcattga ttaagaaagg gaaaataatg gcacaaaaag

53551 gactctttgg tgcaaagccc cgttctagca agaagaacga tgcccagcta cttgctcaac ggaaaaacag

53621 gaagcccgca gtcgaggcta cccacacttc aggaaacgct ctaaaggacg cagttgccag agctcgtact

53691 ctttcaacta ggattcttgg acacgttctt gatagactcg agctaatcac tgaggaagca aaactcgagc

53761 agtatgtaga caaaatgatt gaagacggaa caggttctat cgacgtagaa actgatggac tcgatactat

53831 tcacgatgag ctggcaggag cctgcttgta ctcacctagt caaaaaggaa tccatgctcc tgtcaatcat

53901 gttagcaata tgacgaagat gcgaattaag aatcaaattt ctcctgagtt catgaagaaa atgcttcaac

53971 ggactgtaga ttcaggaatc cctgccatct atcataattc gaaacctgac atgaaaccga tttattggcg

54041 actcggcgtc aaaatgaatg agccagcgtg ggatacatat ttagccgcaa tgcttttaaa tgaaaacgag

54111 cctcacagct tgaaaagtcc tcactctaaa tacgttagga acgaagaaaa cgcagaggtt gcaaaactta

54181 atgacceatt taaaggaatt ccctttagtt taatecctcc tgatgetgcc cacatgxatg cggcctatga

54251 ccctttgcaa accttcgaac tctatgaatt tcaagaacaa tacttgactc caggaaccga acaatgtgaa

54321 gaacataacc eggaaaaagt ctcacgggtt cttcataata ttgagacgcc tccaattaaa gttctcttcg

54391 acatggaagt ctacggtgtc gacttagacc aagataagcc ggcagaaatc agagaacagt ttactgccaa

54461 cacgaacgag gccgagcaag agtttcaaca gctcgtcagc gaatggcagc ctgaaactga agaaccccga

54531 caaactaatt tecagagcta tcaaaaactc gaaatggacg caagaggccg agtgacggca agcatttcca

54601 gccctactca accageaatt ctgttttatg atatcatggg actgaaaagt cccgaaaggg ataaacccag

54671 aggaacaggc gaaagtattg tcgagcattt tgataacgat atctcaaaag cacttcegaa atatagaaaa

54741 tatgcaaaat cagtctcgac ctatacaaca cttgaccaac accttgcaaa gcctgacaat cgaactcaca

54811 ccacattcaa aeagtacgga gctaagacag ggcgtatgtc aagtgagaat cctaacttac agaatattcc

54881 ttctcgcggt gagggtgcag tagtccgaca aacccttgca gccagtgaag ggcattacat tattggtagt

54951 gactactctc aacaagaacc tcgttcattg gcggaactaa gtggegacga aagtatgcga cacgcttacg

55021 aacaaaaccc ggacctatat tcagttatcg gtccgaaact ttatggtgtc ccctatgaag agtgtttaga

55091 gttccatccc gacggaacga ctaacaagga aggaaaacct cgaagaaact ctgtcaagtc cgttctetta

55161 ggtcttatgt aeggccgcgg ggctaactca atcgctgagc agatgaatgc acctgccaaa gaagcgaaca

55231 aggtcattga agatttcttc accgagttcc ctaaagtggc agactatatc aeaetcgttc aacagcaggc

55301 gcaggaeteg ggataegttc aaacagctac cggtcgaaga agaaggctcc ctgatatgag tcttectgaa

55371 cacgagttcg agtatatcga cgctagcaag aacgaagacc ccgacccctc taactttgac gcagaccaac

55441 agatggacga tactgttcct gaacatatta ccgaaaaata ttgggcccag ctagatagag cctggggatt

55511 caagaagaag caagaaatta aagacqaggc aaaagccgaa ggaattccta etaaggacaa cggaggcaag

55581 atagctgatg cccagcgcca acgtttgaac tcagttattc aaggaacggc agccgacatg actaagtacg

55651 caatgattaa ggtacacaat gacgctgaac tgaaagaatt aggatcccac tcaatgactc cagttcacga

55721 tgagttacta ggtgaggttc ctatcaagaa cgcaaaacgg ggagcagaaa ggttgacaga agttacgatt

55791 gaagcagcca aggacattat tagtcttcca atgaaatgtg accccagtat agtagaaaga tggtatggcg

55861 aagaaattga aatctaaaac ccacccagtt gcatatataa ttctagtagt tactgcgaac cctgcgacaa

55931 tttatttcga acceetaaat gtgaaaggaa ctttaattcc cccaagcagt tggtttatgg gacccacttc

56001 cctgcttata aacctaataa gcaagtacga gaagccaaaa tttgcaggct ctttgatatg ggtagggtta

56071 ttccttacct cgctgatttg ctttatgcaa aacctaccac aatcgettgt cgtggcccca ggagttgcat

56141 cttggacaag ccaaaaagca agtgtcttta tattcgacaa gctctcgaat aaaetagacc cgaagattgc

56211 aaacgccctg tctagcaaca tcggttccat catagacgca accatacgga tttcattagg actgagtcct

56281 cctggaattg gaacggctgc atatatagat attccgtcag ccgtactagg ccaagctcta gttcagttta

56351 tcttgcagtc aattgcctcg agatatttga aaaagtagtc aggaaaatxc ctgattacct tgcagtcaat

56421 tgctccgaga tacttgaaaa agcagtcagg aaaattcctg accattcctt ttacaaaaac gctcgacctt

56491 attcattcat tattat



WO 00/32825 PCT/IB99/02040

Table 29

Phage dp1 ORFs list

nb   Name       Frame  Position        Size (a.a.)  Key words
1    dp1ORF001  £      jo oy o . . h u j yu  inn  r uiauvo tan.
2    dp1ORF002  4 1                    1 1AO        Tail;
3    dp1ORF003  3      53538..55877    779          DNA polymerase I;
4    dp1ORF004  3      40401..42440    679          Minor structural;
5    dp1ORF005  1      23674..25434    586
6    dp1ORF006  2      45296..46987    563          SWI/SNF helicase;
7    dp1ORF007  3      22230..23621    463          Terminase;
8    dp1ORF008  1      49624..50961    445          DnaB helicase;
9               1 £    1 O 1DU.. 1
10   dp1ORF010  2      8699..9889      396
11   dp1ORF011  O      ?ftf!17 7QOQR   039          iVMJUi ilDdU,
12   dp1ORF012  1      coie AZ1 Q 3040..OH 1 a  157  DNA pol. III beta;
13   dp1ORF013  3      lU«£lO.. l I«4U  J*» I  DNA pol. III gamma and tau;
14   dp1ORF014  1 3                    117 JO'      UrNM piuila5a t
15   dp1ORF015  4 1    17Q1 A77A       O 1 1
16   dp1ORF016  3      43413..44303    296          Amidase;
17   dp1ORF017  4 1    14.7A7 19flfl1  770
18   dp1ORF018  3      ICQ/* 7 1CCQC 30S47..3DOO0
19   dp1ORF019  2                      ZOO
20   dp1ORF020  1      1004. .2000     IRA 204      exsD; Coenzyme PQQ;
21   dp1ORF021  2      zou4..j^yo      203          GTP cyclohydrolase;
22   dp1ORF022  2      3U09O..3 iO» 0  ICO 20a
23   dp1ORF023  2      04i9..71yo      200
24   dp1ORF025  -1     4 O/TOfi 4 BT7fl 10U2Q.. 10/ f 0
25   dp1ORF024  3      2oyy2..2o/30    240
26   dp1ORF026  2      £ 1012.. 22202  240
27   dp1ORF027  1      52762..53490    242
28   dp1ORF028  3      44595..45299    234
29   dp1ORF029  2      662..1348       228          exsB;
30   dp1ORF031  3      26943..27611    222
31   dp1ORF030  -2     19423..20088    221
32   dp1ORF032  1      52033..52647    204
33   dp1ORF033  2      7670..8239      189
34   dp1ORF035  -1     16859..17425    188
35   dp1ORF036  1      48808..49362    184          DnaC replication;
36   dp1ORF037  1      55855..56388    177
37   dp1ORF034  2      131..652        173
38   dp1ORF038  3      1350..1871      173          exsC; 6-pyruvoyltetrahydropterin;
39   dp1ORF039  3      33 0O.. 3003    4 CC "00     Citrulline biosynthesis;
40   dp1ORF040  1      7192..7683      163
41              3      8208..8699      163          dUTPase;
42   dp1ORF042  1      48082..48561    159
43   dp1ORF043  1      31699..32154    151
44   dp1ORF044  -1     25211..25666    151
45   dp1ORF045  2      25340..25777    145
46   dp1ORF046  3      42774..43202    142
47   dp1ORF047  1      47542..47961    139
48   dp1ORF048  -3     16308..16709    133
49   dp1ORF049  -3     43620..44018    132
50   dp1ORF050  3      15081..15476    131
51   dp1ORF051  2      29765..30154    129
52   dp1ORF053  -3     49917..50300    127
53   dp1ORF052  3      30516..30893    125
54   dp1ORF054  2      14423..14800    125
55   dp1ORF055  3      27627..28004    125
56   dp1ORF056  -3     18780..19151    123
57   dp1ORF057  1      9859..10218     119
58   dp1ORF058  3      15633..15989    118
59   dp1ORF059  1      30154..30507    117
60              -2     37717..38070    117
61   dp1ORF062  -3     44940..45284    114
62              1      47200..47541    113
63              2      29108..29449    113
64   dp1ORF066  -3     28566..28898    110
65   dp1ORF067  -1     44735..45061    108
66   dp1ORF066  3      29451..29768    105
67   dp1ORF069  -3     20094..20411    105
68   dp1ORF061  -3     19161..19475    104
69   dp1ORF070  1      15973..16284    103
70   dp1ORF071  3      38904..39209    101
71   dp1ORF072  -2     50749..51045    98
72   dp1ORF073  3      14262..14555    97
73   dp1ORF074  3      32298..32591    97
74   dp1ORF075  -1     22154..22447    97
75   dp1ORF076  -1     5435..5728      97
76   dp1ORF077  1      14800..15084    94
77   dp1ORF079  -3     35007..35288    93
78   dp1ORF081  -3     55188..55466    92
79   dp1ORF103  2      49352..49627    91
80   dp1ORF080  1      42490..42759    89
81   dp1ORF082  1      44728..44994    88
82   dp1ORF083  -1     35720..35974    84
83   dp1ORF065  -3     51246..51497    83
84   dp1ORF085  -3     10602..10847    81
85   dp1ORF087  -2     29794..30036    80
86   dp1ORF088  3      5040..5279      79
87   dp1ORF089  -2     12256..12495    79
88   dp1ORF273  3      56256..56486    76
89   dp1ORF078  -3     17280..17507    75
90   dp1ORF090  1      27037..27261    74
91   dp1ORF091  1      43189..43413    74           Holin;
92   dp1ORF092  3      46989..47213    74
93   dp1ORF093  -2     45538..45756    72
94   dp1ORF095  3      8877..9089      70
95   dp1ORF096  -1     46469..46681    70
96   dp1ORF097  -1     38888..39100    70
97   dp1ORF098  1      43627..43836    69
98   dp1ORF099  3      38298..38507    69
99   dp1ORF100  1      1597..1803      68
100  dp1ORF101  2      19220..19426    68
101  dp1ORF094  1      8281..8484      67
102  dp1ORF102  2      4034..4237      67
103  dp1ORF104  -1     21224..21427    67
104  dp1ORF105  -2     1828..2028      66
105  dp1ORF106  -3     10329..10529    66
106  dp1ORF108  -1     49250..49447    65
107  dp1ORF109  -2     31435..31632    65
108  dp1ORF110  1      16444..16638    64
109  dp1ORF111  1      28657..28851    64
110  dp1ORF113  -2     17521..17715    64
111  dp1ORF084  1      15445..15636    63
112  dp1ORF114  2      52952..53143    63
113  dp1ORF115  -3     5151..5342      63
114  dp1ORF116  -1     20474..20662    62
115  dp1ORF117  -3     24492..24680    62
116  dp1ORF118  2      15023..15208    61
117  dp1ORF119  2      41054..41239    61
118  dp1ORF120  1      28387..28569    60
119  dp1ORF121  3      39222..39404    60
120  dp1ORF122  -1     40220..40402    60
121  dp1ORF123  -2     21145..21327    60
122  dp1ORF124  -3     17712..17891    59
123  dp1ORF125  -3     49740..49916    58
124  dp1ORF126  -3     15960..16136    58
125  dp1ORF127  -3     13335..13511    58
126  dp1ORF128  1      4852..5025      57
127  dp1ORF129  2      25133..25306    57
128  dp1ORF130  -1     16619..16789    56
129  dp1ORF131  1      43846..44013    55
130  dp1ORF132  -1     15137..15304    55
131  dp1ORF133  -2     7900..8061      53
132  dp1ORF135  3      780..938        52
133  dp1ORF136  -1     55094..55252    52
134  dp1ORF137  -2                     52
135  dp1ORF138  3      30504..30662    52
136  dp1ORF139  -3     11934..12092    52
137  dp1ORF140  3      20562..20717    51
138  dp1ORF141  -1     42767..42922    51
139  dp1ORF142  -3     31743..31898    51
140  dp1ORF143  -3     7410..7565      51
141  dp1ORF144  1      36517..36669    50
142  dp1ORF145  1      42067..42219    50
143  dp1ORF146  1      51484..51636    50
144  dp1ORF147  1      55207..55359    50
145  dp1ORF148  -1     28484..28636    50
146  dp1ORF150  -3     15033..15185    50
147  dp1ORF134  -2     349..498        49
148  dp1ORF151  1      28027..28176    49
149  dp1ORF152  1      42235..42384    49
150  dp1ORF153  2      22307..22456    49
151  dp1ORF086  2      52760..52906    48
152  dp1ORF154  2      18446..18592    48
153  dp1ORF155  3      13512..13658    48
154  dp1ORF156  3      18777..18923    48
155  dp1ORF157  -2     13135..13281    48
156  dp1ORF158  -3     40581..40727    48
157  dp1ORF159  -3     30225..30371    48
158  dp1ORF149  -3     26331..26474    47
159  dp1ORF160  2      41324..41467    47
160  dp1ORF161  2      52175..52318    47
161  dp1ORF162  3      13020..13163    47
162  dp1ORF163  3      40224..40367    47
163  dp1ORF164  -2     6553..6696      47
164  dp1ORF165  -3     50361..50504    47
165  dp1ORF166  -3     23376..23519    47
166  dp1ORF167  3      1008..1148      46
167  dp1ORF168  -2     54205..54345    46
168  dp1ORF169  -2     45814..45954    46
169  dp1ORF170  -2     27460..27600    46
170  dp1ORF171  -3     47538..47678    46
171  dp1ORF172  -1     10325..10462    45
172  dp1ORF173  -2     32023..32160    45
173  dp1ORF174  -2     29629..29766    45
174  dp1ORF175  -2     15511..15648    45
175  dp1ORF176  -3     42894..43031    45
176  dp1ORF177  -3     19800..19937    45
177  dp1ORF178  -3     11787..11924    45
178  dp1ORF112  2      32207..32341    44
179  dp1ORF179  3      56058..56192    44
180  dp1ORF180  -1     41042..41176    44
181  dp1ORF181  -1     12992..13126    44
182  dp1ORF182  -2     45235..45369    44
183  dp1ORF183  -2     13762..13896    44
184  dp1ORF184  -3     53196..53330    44
185  dp1ORF185  1      22522..22653    43
186  dp1ORF186  2      21272..21403    43
187  dp1ORF187  2      34415..34546    43
188  dp1ORF188  2      35609..35740    43
189  dp1ORF189  2      42587..42718    43
190  dp1ORF190  3      39786..39917    43
191  dp1ORF191  -1     40865..40996    43
192  dp1ORF192  -1     2789..2920      43
193  dp1ORF193  -2     42325..42456    43
194  dp1ORF194  -2     40153..40284    43
195  dp1ORF195  -3     42453..42584    43

196 


dp10RF196 


-3 


11U2..11273 


43 




197 


dp1ORF107 


1 


10750.. 10878 


42 




198 


dp10RF197 


2 


7484..7612 


42 




199 


dp10RF198 


2 


24 11 9..24247 


42 




200 


dp10RF199 


-1 


15614..15742 


42 




201 


dp1ORF200 


-3 


47715-47843 


42 




202 


dp1ORF201 


1 


3S569..38694 


41 




203 


dp1ORF202 


2 


44483.. 44608 


41 




204 


dp1ORF203 


-3 


22656..22781 


41 




205  dp1ORF204   1  1471..1593    40
206  dp1ORF205   1  8524..8646    40
207  dp1ORF206   1  19855..19977  40
208  dp1ORF207   1  27502..27624  40
209  dp1ORF208   2  47279..47401  40
210  dp1ORF209   3  29784..29906  40
211  dp1ORF210  -1  52955..53077  40
212  dp1ORF211  -1  20837..20959  40
213  dp1ORF212  -2  52861..52983  40
214  dp1ORF213  -2  30169..30291  40
215  dp1ORF214  -2  24151..24273  40
216  dp1ORF215  -3  35700..35822  40
217  dp1ORF216  -3  32727..32849  40
218  dp1ORF217  [illegible]  23443..23562  39
219  dp1ORF218   3  22029..22148  39
220  dp1ORF219  -1  51269..51388  39
221  dp1ORF220  -1  6215..6334    39
222  dp1ORF221  [illegible]  [illegible]  38
223  dp1ORF222   3  13212..13328  38
224  dp1ORF223   3  14055..14171  38
225  dp1ORF224  -1  13505..13621  38
226  dp1ORF225  -2  32875..32991  38
227  dp1ORF226  -2  25075..25191  38
228  dp1ORF227  -2  22999..23115  38
229  dp1ORF228   1  10450..10563  37
230  dp1ORF229  [illegible]  27634..27747  37
231  dp1ORF230   2  50723..50836  37
232  dp1ORF231  [illegible]  30958..31071  37
233  dp1ORF232  -2  29272..29385  37
234  dp1ORF233  [illegible]  [illegible]  37
235  dp1ORF234   1  [illegible]  [illegible]
236  dp1ORF235  [illegible]  [illegible]  [illegible]
237  dp1ORF236  -1  37418..37528  36
238  dp1ORF237  [illegible]  [illegible]  [illegible]
239  dp1ORF238  [illegible]  [illegible]  [illegible]
240  dp1ORF239   1  [illegible]  [illegible]
241  dp1ORF240   1  [illegible]  [illegible]
242  dp1ORF241  -1  [illegible]  35
243  dp1ORF242  -1  41231..41338  35
244  dp1ORF243  -2  [illegible]  [illegible]
245  dp1ORF244  -3  [illegible]  [illegible]
246  dp1ORF245  [illegible]  [illegible]  [illegible]
247  dp1ORF246  -3  [illegible]  [illegible]
248  dp1ORF247   1  [illegible]  34
249  dp1ORF248   1  [illegible]  34
250  dp1ORF249   2  2012..2116    34
251  dp1ORF250  [illegible]  [illegible]  34
252  dp1ORF251  -1  [illegible]  34
253  dp1ORF252  -2  [illegible]  34
254  dp1ORF253  [illegible]  [illegible]  34
255  dp1ORF254  [illegible]  [illegible]  34
256  dp1ORF255  [illegible]  [illegible]  34
257  dp1ORF256   1  [illegible]  33
258  dp1ORF257   1  [illegible]  33
259  dp1ORF258   1  44023..44124  33
260  dp1ORF259   2  4298..4399    33
261  dp1ORF260   2  24746..24847  33
262  dp1ORF261   3  288..389      33
263  dp1ORF262   3  9408..9509    33
264  dp1ORF263  -1  26951..27052  33
265  dp1ORF264  -1  6038..6139    33
266  dp1ORF265  -1  4700..4801    33
267  dp1ORF266  -2  50119..50220  33
268  dp1ORF267  -2  47266..47367  33
269  dp1ORF268  -2  12520..12621  33
270  dp1ORF269  -3  53733..53834  33
271  dp1ORF270  -3  50691..50792  33
272  dp1ORF271  -3  19638..19739  33
273  dp1ORF272  -3  1455..1556    33

Table 30

Predicted Dp-1 amino acid sequences 

dp1ORF001

36698 atqattgacaacaatttacctatgagtccaattcctggcgaaattgctcaagtatacgaccaaaaccccaatctaattggagca 

1 M IDNNLPMSPI PGEIVQVYDQNFNLIGA 

36782 aqtgatgaaacctttagcaagcatcacgaagacgaaatcgtgactcgagcccgaggaaaagaaaccttcacctttgaaagcatt 

29 s deIPSKHYEDBIVTRARGKBTPTPESI 

36866 gaaacctcatccatctatcaacactcaaaggttgaaaacattatccagtatggaggaagacggtttcgaattaaacacgcccag 

57 i TS SIYQHLKVENIIQYGGRWFRIKYAQ 

36950 qacgcagaagatgtcaaagggcttaccaagtttacctgccacgcactacggtatgaactagcagaaggcttgcctaggaagttg 

85 DVEDVKGLTKFTCYALWYELAEGLPRKL 

37034 aaacacgtcgcttcttctgtaggcgctgtcgcgctagatattatcaaagacgcaggtgaatgggctcgaccagcttgtcctcct 

113 KHVASSVGAVALDIIKDAGEWVRLVCPP

37118 gacqgtqctaacaaacaagttcgaagcataacagccgcagaaaattcaatgctttggcatctccgatatcctgcaaagcaatac 

141 DGANKQVRSI TAAENSMLWHLRYLiAKQY 

37202 aattcagaactgacatttggtcacgaagaaattatcaagcaagaggttagaattgtccaaaccgttgtatctcttcagcctcat 

169 NLEL TPGYEEI IKQEVRIVQTVVFLQPY 

37286 gtcgagtctaaagtagactttcctcttgcagttgaagagaatttgaaacatgtcactaggcaggaagactctcgaaacccgtgt 

197 V B S KVDFPI»VVEEMLKYVTRQEDSRNLC 

37370 acggcttacaagttgacaggtaaaaaggaagaaggcagccaagagcctccaacgtctgcttctaccaacaacggaagtgaacat 

22S TAYKLTGKKEEGSQEPLT FAS INNGSBY 

37454 cccattgatgtttcgtggtctactacacgccacatgaagcctcgatatattgctaaatctaaaagcgacgaacactttagaatt 

253 LIDVSWFTTRHMKPRYI .AKSKSDEHFRI 

37538 aaagaaaatttgatgagtgctgcgcgtgcttatctcgacatctacagccgcccactaattggatacgaggcttcagcggrcctt 

281 KENLMSAARAYLDIYSRPLIGYEASAVL

37622 cataacaaggttcctgacttgcaccatactcaactaattgccgacgaccattatgatgttatcgagtggcgaaagatatctgct 

309 YHKV PDLHHTQLI VDDHYDVIEWRKISA 

37706 cqaaaaattgactacgacgaccttccaaactctactatcattttccaagaccctcgaaaagacctgatggacctgccaaacgag 

337 R jciOYDDLSNSTIIFQDPRKDLMDLLNE 

37790 oacggcgaaggagtcctttcaggggaaactgtaaatgagtcccaagttgttatcagatacgcagacgacattttagggactaat 

365 SgEGVLSGBTVMESQVVIRYADDILGTN 

37874 trtaatgcagaacccgggaaacacactggtgtccttaatactaataagaaaccgagcgaattagttcctgacgactttacatgg 

393 pnaIsG KYIGVLNTNKKPSBLVPDDPTW 

37958 attcgactagaaggtcccaaaggtgacgcaggtttaccgggagctcctgggcgtgatggagtcgacggtgtacctggaaagagc 

421 IRLB gpkgcaglpgapgrdgvdgvpgks 

38042 ggagtagggatagcagacacagctatcacttatgccgtatccgtttccggaacgcaagagcctgaaaatggatggagcgaacaa 

449 GVGIADTAITYAVSVSGTQEPENGMSEQ 

38126 qttcctgaactcataaaaggtcgactcctgcggaccaaaacattttggagatatactgacggctcacatgaaactggatactcc 

477 VPELIKGRFLWTKTPWRYTDGSHETGYS 

38210 gttgcccatatagggcaagacggaaattccggaaaagacggaatcgcaggtaaggacggagtaggtatagccgcaactgaagtc 

505 VAYIGQDGNSGKDGIAGKOGVG IAATBV 

38294 atgtatgcaagctcgccatccgctaccgaagccccagctggtggatggtctacgcaagttcctaccgtcccaggtggccagtat 

533 M YASSPSATEAPAGGWSTQVPTVPGGQY 

38378 ctatggactcgaacaagatggcgctacactgaccaaactgatgaaattggatatccagtctcaagaatgggcgagcagggtcct

561 LWTRTRWRYTDQTDEIGYSVSRMGEQGP

38462 aaaggtgacgcaggtcgcgacggrattgcaggaaagaacggaatagggttgaagccaacttcagtttcttatggaattagtccc 

589 KG DAGRDGIA. GKNGIGLKSTSVSYGISP 

38546 actgattctgcgattcctggagtatgggcttcacaagttccttccttaatcaaaggtcaatatcttcggactcgaactatttgg 

617 T 5 SAIPG VWASQVPSLIKGQYLWTRT1W 

38630 acctataccgattcaactaccgaaacgggctatcaaaaaacctacatcccaaaagacgggaatgacggtaaaaatggaattgct 

645 TYTDSTTETGYQKTYI PKDGNDGKNGIA 

38798 acttctgctattccaaacgttcaaccgggattcttctcgcggacgaaaactgcctggaaccatactgatgacactagcgaaaca 

701 TSAIPMVQPGFFLWTKTVWNYTDDTSET 

38882 ggttactcagttcccaagataggtgaaacaggtcctagaggagttcaaggtcttcaaggtcctcaagggcttcaaggaacccct 

729 GYSVSKIGETGPRGVQGLQGPQGLQGI P 

38966 ggacctgcaggagctgacggacgttcgcaatatacccacctcgcctcctccaatagtccaaacggcgagggacctagtcatacc 

757 GPAGADGRSQYTHLAPSNSPNGEGFSHT 

39050 qacagcggacgagcatacgtcggtcagtaccaagatttcaatcccgtccactcaaaagaccctgcagcctatacacggacgaaa 

785 DSG R AYVGQYQDFNPVHSKDPAAYTWTK 

39134 tggaaggggaatgacggagcteaagggacacccgggaagccaggcgcagacggtaagactaattatctccatatagcttacgct 

813 WKGNDGAQGIPGKPGADGKTNYPHIAYA

39219 ccaaqtgcagacggatcacgtgagttcagtttggaagataacaatcaacaatatacgggttatcactccgattafcgagcaagca 

841 SS ADGSREFSLEDMNQQYMGYYSDY-E-QA 

39302 gacagcagggaccgaactaagtaccgatggtctgaccgccttgccaacgcccaagtgggaggtcgaaacgagtcccttaattct 

869 DSRDRTKYRWFDRLANVQVGGRNEFLNS 

39386 ccatccgaatttggtttaaaaccccgctattctagtcacaacctaatggacggacaagaccaaacgcaaggacagacatctgct 

897 LPE FG LKPRYSS YMLMDGQDQ TQGQ X S A 

39470 actattgacgaacgtcaacggttcaaaggtgctaactctttacgacctgacccaacatggaacggtaaaccgcagaaccaaaaa 

925 TIDERQRFKGANSLRLDSTWNGKPQNQK

39554 ctgaccttttcttcaggaggagatacgcgattaggtactccaaccgagtggtctaatttagaaggccgtatcagtttctgggct

953 LTFSLGGDTRLGTPTEWSNLEGRISFWA

39638 aaggcctctaggaacggagtgagcttagctgcacggccgggttatcgtagtaacgtatttaccgcaaccttaaccgatcaatgg 

981 KASRNGVSLAARPGYRSNVFTATLTDQW 

39722 aagttctacgattttaaattctttgacaaagttaattcaaattgtaccgctgaagcaattttccatgtattcactcaaagttgt 

1009 KFYDFKFFDKVNSNCTAEAI FHVFTQSC 

39806 ccagtgtggctcaatcatattaaaatcgaacttggtaatatctctactcctttcagtgaagcagaggaagaccttaaatatcga 

1037 SVWLNKIKIELGNISTPFSEAEEDLKYR 

39890 attgactcaaaagccgateaaaagctaactaaccaacagttgacggcactcacggaaaaggctcaactacatgacgeagaactg 

1065 IDSKADQKLTNQQLTALTEXAQLHDAEL 

39974 aaagctaaggctacaatggagcagttaagtaacttagaaaaggcttatgaaggtagaatgaaagctaacgaagaagctatcaaa 

1093 KAKATMEQLSNLEKAYECRMKANEEAIK 

40058 aaatcggaagccgacctaatcttagcggcaagtcgaattgaagctactatccaagaacttggcgggctacgggaactgaagaag 

1121 KSEADLILAASRI EATIQEL GG LRELKK 

40142 ttcgtcgacagttacatgagctcttctaatgaaggtctaattatcggtaagaacgacggtagctctaccattaaggtatcaagt 

1149 FVDSYMSSSNEGLI IGKMDGSSTIKVSS 

40226 gaccgaatttctatgttctccgcagggaatgaagttatgtaccttacgcaagggttcattcacatcgataacgggatctttacc 

1177 DRISMFSAGNEVMYLTQGFIHIDNGIFT 

40310 caatccattcaagtcggccgatttagaacggaacaatactcgtttaatccagacatgaacgtgattcggtatgtaggataa 
40390 

1205 QSIQVGRFRTEQYSFNPDMNV1 R Y V G * 
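
The nucleotide coordinates printed with each listing can be checked against its residue numbering. The sketch below is an illustration only and is not part of the original disclosure. It assumes that the coordinates are 1-based and inclusive and that each span includes the terminal stop codon, which agrees with the dp1ORF001 listing above (36698 to 40390, final residue 1230). The function name `orf_size_aa` is hypothetical.

```python
def orf_size_aa(start: int, end: int) -> int:
    """Return the number of encoded residues for an ORF spanning start..end."""
    # Assumption: coordinates are 1-based and inclusive.
    length_nt = end - start + 1
    if length_nt % 3 != 0:
        raise ValueError(f"{start}..{end} is not a whole number of codons")
    # Assumption: the span includes the stop codon, which encodes no residue.
    return length_nt // 3 - 1


# dp1ORF001 as listed above: 36698..40390
print(orf_size_aa(36698, 40390))  # 1230
```

A span that is not a multiple of three raises an error, which is a quick way to flag a coordinate that was misread during text extraction.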
dp1ORF002

32386 atggattttgggtcaattgcagcaaaaatgactttggatatctcaaacttcacaagtcaattaaatcttgctcaaagtcaagcg 

1 MDFGSIAAKMTLDISNFTSQLNLAQSQA 

32470 caacggctcgcaetagagtcttcgaagtcctttcaaattggttctgctttaacaggattagggaaaggacttacgactgcggtt 

29 QRLALESSKSFQIGSALTGLGKGLTTAV 

32S54 acccttcctcttatgggatttgcagccgcctctattaaagtagggaatgaattccaagctcaaatgtcccgtgttcaagctatt 

57 TLPLMGFAAASIKVGNEFQAQMSRVQAI 

32638 gcaggagcgacagcggaagagcttggcagaatgaagactcaagcaatcgaccttggcgctaaaactgcttttagtgcaaaagag 

85 AGATHA EELGRMKTQAZ DLGAKTAPSAKB 

32722 gcggctcaaggtatggaaaatctagcttcagccggtttccaggtaaatgaaatcatggacgctatgccaggggtacttgacctg 

113 AAQGMENLASAGFQVNBXHDAMPGVLDL 

32806 gctgccgtatctggaggagatgtggccgcgagctccgaggccatggctagttcacttcgagcctttggattagaggcaaaccag 

141 AAVSGGDVAAS SBAMASS LRA FG L EANQ 

32890 gcgggtcacgtggctgacgtatttgctcgagcagcagctgatacgaacgcagaaactagcgacatggcagaggcgatgaaatac 

169 AGHVADVFARAAADTMAETSDMABAMKY 

32974 gtcgcacccgttgctcactctatgggcttgagccttgaagaaacggctgcgtctattgggattatggccgaegccggtattaag 

197 VAPVAHSMGLSLEETAASZGZMAOAGIK 

33058 ggctcgcaagccggaaccacgcttagaggcgctctctcgcgtattgccaaacctacgaaagcgatggtcaaatcaatgcaggaa 

225 GSQAGTTLRGALS R XAKPTKAMVKSMQB 

33142 ttaggagtttcgttctacgacgcgaacggaaacatgattccactaagagaacaaatcgctcaactgaaaacagctactgcagga

253 LGVSFYDANGNMIPLREQIAQLKTATAG

33226 ctaacacaagaggaacgaaatcgtcaccttgttaccttgtatggccaaaactcgttgtcaggtatgcttgcactattagacgca 

281 LTQE8RNRHLVTLYGQHSLSGMLALLDA 

33310 ggtcctgagaaattggataagatgaccaatgctctcgtgaactcggacggagctgccaaggaaatggcagaaactatgcaggac 

309 G PEKLDKMTNALVNS DGAAKEMAETMQD 

333 94 aaccttgetagtaaaatcgagcaaatgggaggagctttcgagtctgttgetattattgttcaacaaatccttgagcctgcactt 

337 NLASKIEQMGGAFESVAIIVQQILEPAL 

33478 gctaaaatcgtgggagcaatcacaaaagccctcgaagcattcgtaaatatgtcacctatcggtcaaaagatggttgtcatattc

365 AK IVGAITKVIiEAFVMMSPIGQKMVVlF 

33562 gcaggaatggttgcagcccttggaccactgcttctaattgcaggaacggtgatgacaactattgtcaagttaagaattgctatt

393 AGMVAALGPLLLIAGMVMTTI VKLRIAI 

33646 cagtttttaggtccagcatttatgggaacgatgggaaccattgcaggagttatagcaatattctatgctctggtcgccgtgttc 

421 QFLGPAPMGTMGTIAGVIAIFYALVAVF 

33730 atgatagcctacacaaaatcggagagatttagaaactttatcaacagtcttgcgcctgctattaaagetgggtttggaggagcg 

449 MIAYTKSBRFRNFINSLAPAZ KAGFGGA 

33814 ttggaatggctacttccacgactgaaagagttaggagaatggttacagaaggcaggcgagaaggcgaaagagttcggtcagtet 

477 tBWLLPRLKBLGEWLQKAGEKAKEFGQS 

33898 gtagggtctaaagtgtcaaaactgctcgaacagtttggaataagtateggtcaggcaggaggctcgattggtcagttcattgga 

505 VGSKVSKLLEQFGIS ZGQAGGS IGQFIG 

33982 aatgttetcgaaaggctaggaggcgcatttggaaaagtaggaggagtcatttcaattgctgtttcacttgtaacaaaattcggt 

533 NVLERLGGAFGKVGGVZSZAVSLVTKFG 

34066 ctcgcatttctagggattacaggaccactcgggattgctattagtctgttagtttcatttttgacagcttgggctagaacaggt 

561 LAFLGZTGPLGZAISLLVSFLTAHARTG 

34150 gagttcaacgcagacggaattactcaagtattcgaaaacttgacaaacacaattcagtcgacggctgatttcatctctcaatac

589 EFNADGITQVFENLTNTIQSTADFISQY

34234 cttccagtctttgtcgaaaaaggaactcaaattttagttaagattattgaaggaattgcatctgctgttcctcaagtagttgaa 

617 LPVFVBKGTQILVKI ZEGIASAVPQVVE 

34318 gtgatttcacaagtcattgaaaatattgtgatgacaatttcgacagttatgcctcaattagtcgaagcaggaattaagacactc

645 VISQVZENIVMTXSTVMPQLVEAGZ-K-ZL 

34402 gaagcgcttataaatggtcttgtccaatctcttcctactatcattcaagcagctgttcaaattatcactgctltattcaatggt 

673 EALZNGLVQSLPTI IQAAVQZ ZTALFNG 

34486 cttgttcaggcacttcctacgcttattcaagcaggtcttcaaattttgtcagctctcataaacggactagttcaagcgcttccg 

701 LVQALPTLZQAGLQZ LSALZNGLVQALP 

34S70 gcaattattcaagcagctgttcaaattatcatgtcgcttgttcaagcactaattgaaaacttgcctatgataatcgaagcagcg 

729 AIIQAAVQIIMSLVQALIENLPMIIEAA

34654 atgcagattacaatgggtctagtcaacgcactgatcgaaaatataggacctatcttagaagcagggactcaaactctaatggct

757 MQIIMGLVNALIENIGPILEAGIQILMA 

34 73 9 tcaaccgagggacccatccaagtgcctcctgaactaaccacagcagcgatccaaaccactacttcaccaccagaagcaatcccg 

785 LIEGLIQVLPELITAAIQI ITSLLEAIL, 

34 822 tcgaaccttcctcaacttctagaagccggagttaaattgcttttatcacttcttcaagggttgctaaatatgcttcctcaacta 

813 5NLPQLLEAGVKLLLSLLQGLLNMLPQL 

3 4 906 attgcaggggctttgcaaatcatgatggcacttcttaaagcagttategacttcgtccctaaacttcttcaagcaggtgttcaa 

841 IAGALQIMMALLKAVIDFVPKLLQAGVO 

34 990 cttccnaaggcatcgattcaaggtattgcctcacctctcggctcacttttatcgacagctggaaacatgctttcaccattagct 

869 LLKALIQGIASLLGSLLSTAGNMLSSLV 

35074 agcaagattgctagctttgtgggacagatggtttcaggaggtgcgaacctgattcgaaacttcattagtggtattgggtcaatg 

897 SKIASFVGQMVSGGANLIRNFISGIGSH 

35158 attggttcagctgtctctaaaattggcagcatgggaacttcaattgtttctaaggttactggattcgctggacaaatggtaagc 

925 IGSAVSKIGSMGTSIVSKVTGFAGQMVS 

35242 gcaggggtcaacctegttcgaggatttatcaatggtateagttccatggtaagttctgcggtaagtgcggcggctaatatggct 

953 AGVNLVRGFINGISSMVSSAVSAAANMA 

35326 agcagtgcattaaatgccgttaagggattcttaggtatteactctccttcacgtgtcatggagcagatgggtatctatacgggt 

9B1 SSALMAVKGFLGIHSPSRVMEQMGIYTG 

35410 caagggttcgtaaatggtattggtaacatgattcgaactacacgtgacaaggctaaagaaatggctgaaactgttactgaagct 

1009 QGFVNGIGNMIRTTRDKAKEMAETVTBA 

35494 ctcagcgacgtgaagatggatattcaagaaaatggagttatagaaaaggttaaatcagtttacgaaaagatggctgaccaactt 

1037 LSDVKMDIQENGVIEKVKSVYEKMADQL 

35578 cctgaaactcttccagetcctgatttcgaagatgttcgtaaagcagccggttcgcctcgagtggacttgttcaatacaggaagt 

1065 PBTLPA PDFEDVRKAAGS PRVDLFNTGS 

35662 gacaaccccaaccaaccrcagtcacaaectaaaaacaaccaaggcgagcaaaccgctgtcaacaccggaacaatcgtagtccga 

1093 QMPNQPQSQSKNNQGEQTVVNIGTIVVR 

35746 aacaatgacgacgttgacaaactgtcgagaggattgtataatagaagtaaagaaactctatcagggtttggtaacattgtaaca 

1121 NMDDVDKLSRGLYNRSKETLSGFGNIVT 

35830 ccgtaa 35835

1149 P * 
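
Each nucleotide line in these listings is paired with the residues it encodes, so a line can be checked by translating it. The following is a minimal sketch, assuming the standard genetic code applies to these listings; the names `CODON_TABLE` and `translate` are illustrative and do not come from the original text. Codons containing characters other than a, c, g, t are reported as `X`.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order
# (assumption: the listings use the standard code).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a nucleotide string codon by codon; '*' marks a stop."""
    dna = dna.lower()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


# First nucleotide line of the dp1ORF002 listing (position 32386).
line = ("atggattttgggtcaattgcagcaaaaatgactttggatatctcaaacttcacaagt"
        "caattaaatcttgctcaaagtcaagcg")
print(translate(line))
```

Comparing the output with the residue line printed under each nucleotide line shows where the two disagree, which usually points to a character misread during extraction rather than a real sequence difference.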
dp1ORF003

53538 atggcacaaaaaggactctttggtgcaaagcctegttctagcaagaagaacgatgctcagttacttgctcaacggaaaaacagg 

1 MAQKGLFGAKPRSSKKNDAQLLAQRXMR 

53622 aagcctgcagttgaggttacttacatttcaggaaacgctctaaaggacgcagttgctagagctcgtaccctttcaaccaggatt 

29 KPAVEVTYISGMALKDAVARARTLSTRI 

53706 cttggacacgttcttgatagacttgagtcaatcactgaggaagcaaaactcgagcagtatgtagacaaaatgattgaagacgga

57 LGHVLORLELXTBEAKLBQYVDKMIEDG 

53790 acaggctccattgacgtagaaactgatggactcgatactattcacgatgagctggcaggagtctgcctgtactcacccagtcaa 

85 IGSIOVETDGLDTIHDBLAGVCLYSPSQ 

53874 aaaggaatctatgctcctgtcaatcatgttagcaatatgacgaagatgcgaattaagaatcaaatttctcctgagttcatgaag 

113 KGIYAPVHHVSNMTKMRIKNQISPBFMK 

53958 aaaatgcetcaacggattgtagattcaggaattcctgtcatctatcataattcgaaacttgacacgaaatcgatctactggcga 

14 1 KMLQRIVDSGIPVIYHNSKFDMKSIYWR 

54 042 cccggcgccaaaacgaatgagccagcgtgggatacacatttagccgcaacgcttccaaatgaaaacgagtctcacagcttgaaa 

169 LGVKMMEPAWOTYLAAMLLMENESHSLK 

54126 agtcttcactctaaatatgttaggaacgaagaaaacgcagaggttgcaaaatttaatgacttatttaaaggaattccttttagt 

197 SLHSKYVRNEENAEVAKPNDLFKGIPFS 

54210 ctaattcctcctgatgttgectatatgtatgcggcctatgaccctttgcaaactttcgaactctatgaatttcaagaacaatac 

225 LI PPDVAYMYAAYDPLQTFELYEFQEQY 

54294 ttgactccaggaactgaacaatgtgaagaatataacctggaaaaagtctcatgggttctccataataccgagatgcctctaatc 

253 LTPGTBQCEEYNLBKVSWVLHNI BMPLZ 

54378 aaagttctcttcgacatggaagtctaeggtgtcgacttagaccaagataagctggcagaaattagagaacagtttactgccaat 

281 KVLFOMBVYGVDLDQOKLAEI REQFTAN 

54462 atgaacgaggctgagcaagagtttcaacagcttgtcagcgaatggcagcctgaaattgaagaacttcgacaaactaatttccag 

309 MHBAEQBFQQbVSEWQPBI EELRQTNFQ 

54 546 agccatcaaaaactcgaaatggacgcaagaggtcgagcgacggtaagcatttccagtcccactcaattagcaatcctgttttat 

337 SYQKLBMDARGRVTVSISSPTQLAILFY 

54630 gatatcatgggattgaaaagtcctgaaagggataaacctagaggaacaggcgaaagtattgtcgagcattttgataacgatatc

365 DIMGLKSPERDKPRGTGBS IVEHFDNDX 

54714 tcaaaagcacttttgaaatatagaaaatatgcaaaattagtttcgacctatacaacacttgaccaacaccttgcaaagcctgac 

393 SKALLKYRKYAKLVSTYTTLDQHLAKPD 

54798 aatcgaattcacactacattcaaacagtacggagctaagacagggcgtatgtcaagtgagaatcctaacttacagaatattcct 

421 MRIHTTFKQYGAKTGRHSS ENPNLQNIP 

54882 tctcgcggtgagggtgcagtagttcgacaaatctttgcagccagtgaagggcattacattattggtagtgactactctcaacaa

449 SRGEGAVVRQIFAASBGHYI IGSDYSQQ 

54 966 gaacctcgttcattggcggaattaagtggcgacgaaagtatgcgacatgcttacgaacaaaacctggacctatattcagttatc 

477 EPRSLAELSGDESMRHAYBONLDLYSVI 

55050 ggttcgaaactttatggtgttccctatgaagagtgtttagagttctatcccgacggaacgactaacaaggaaggaaaacttcga 

505 GSKLYGVPYEECLEFYPDGTTMKEGKLR

55134 agaaattctgtcaagtccgttcttttaggtcttatgtacggccgcggggctaactcaatcgctgagcagatgaatgtatctgtc 

533 RNSVKSVLLGLM'YGRGANS IAEQMNVSV 

55218 aaagaagcgaataaggttattgaagatttcttcaccgagttccctaaagtggcagactatatcatattcgttcaacagcaggcg 

561 KEANKVZ EDFFTEFPKVADYI I FVQQQA 

55302 caggacttgggatatgttcaaacagctaccggtcgaagaagaaggcttcctgatatgagtcttcctgaatacgagttcgagtat 

589 QDLGYVQTATGRRRRLPDMSLPEYEFEY 

55386 atcgacgctagcaagaacgaagatttcgacccctttaactttgacgcagaccaacagatggacgatactgttcctgaacatatt

617 IDASKNEDFDPFNFDADQQMDDTVPEHI

55470 atcgaaaaatattgggcccagctagatagagcctggggatttiaagaagaagcaagaaactaaagaccaggcaaaagccgaagga 

645 IEKYWAQLDRAWGPKKKQEIKDQAKAEG 

555S4 attcttattaaggacaacggaggcaagacagctgatgctcagcgccaatgtttgaactcagttattcaaggaacggcagccgac 

673 I LI KDNGGKIADAQRQCLNSVI QGTAAD 

55638 atgactaagtacgcaatgattaaggtacacaatgacgccgaattgaaagaattaggattccatttaatgatcccagttcacgat 

701 MTKYAMIKVHNDAELKELGFHLMIPVHD 

55722 gagttactaggtgaggttcctatcaagaacgcaaaacggggagcagaaaggttgacagaagttatgattgaagcagccaaggac 

729 ELLGBVP I KNAKRGABRLTEVM IEAAKD 

55806 attattagtcttccaatgaaatgtgaccccagtatagtagaaagatggtatggtgaagaaatcgaaatctaa 55877 

757 IISLPMXCDPSXVERWYGBBIEI* 
dp1ORF004

40401 atgacaaaatttatcaactcatacggccctctecacttgaacctttacgtcgaacaagttagtcaggacgtaacgaacaactcc 

1 MTKFINSYGPLHLNLYVEQVSQDVTNNS 

40485 tcgcgagttagttggcgagctactgtcgaccgcgatggagcttatcgaacgtggacttatggaaatattagraacctttccgta 

29 SRVSWRATVDRDGAYRTWTYGN ISNLSV 

40569 tggtcaaatggttcaagtgttcatagcagtcacccagactacgacacgtccggcgaagaggtaacgctcgcaagtggagaagtg 

57 WLNGSSVHSSKPDYDTSGEEVTLASGEV 

40653 actgttcctcacaatagtgacgggacaaagacaatgtccgtttgggcttcgtttgaccctaataacggcgttcacggaaatatc 

85 TVPHNSDOTKTMSVWASFDPNNGVHGMI 

40737 actatctctactaattacactttagacagtattccaaggtctacacagatttctagttttgagggaaatcgaaacctaggatct 

113 TISTNYTLDSIPRSTQISSFBGNRNLGS 

40821 ttacatacggttatcttcaaccgaaaagtgaactctcttacgcatcaagtttggcaccgagttttcggtagcgactggatagat 

141 LHTVI FNRKVNSFTHQVWYRVFGSDWID 

40905 ttaggtaagaaceatactactagcgtatcctttacgccgtcactggacttagcaaggtacttacctaaatcaagttccggaaca 

169 LGKNHTTSVSFTPSLDLARYLPKSSSGT 

40989 atggacatctgtattcgaacctataacggaactacgcaaattggtagtgacgtctattcaaacggatggaggttcaacatcccc

197 MDICIRTYNGTTQIGSDVYSNGWRFNIP 

41073 gactcagcacgccccactccttcgggcatctccttagtagacacgacttcagcggttcgacagattttaacagggaacaacctc 

225 DSVRPTFSGISLVDTTSAVRQILTGNNF 

41157 ctccaaatcatgtcgaacattcaagccaacctcaacaacgctcccggcgcttacggatccactatccaagcatttcacgctgag 

253 LQIMSNIQVNFNNASGAYGSTIQAFHAB 

41241 ctcgtaggtaaaaaccaagctatcaacgaaaacggcggcaaattgggtatgatgaactttaatggctccgctaccgtaagagca 

281 LVG KNQA I NEHGG KLGMHN FNG S ATVRA 

4132S tgggttacagacacgcgaggaaaacaatcgaacgcccaagacgtatctatcaatgttatagaacactatggaccgtctatcaat 

309 WVTDTRGKQSNVQDVSINVIBYYGPSIN 

41409 ttctccgttcaacgtactcgtcaaaatcctgcaactatccaagctcttcgaaatgctaaggtcgcacctataacggtaggaggt 

337 FSVQRTRQNPAI IQALRNAKVAPITVGG 

41493 caacagaaaaacatcatgcaaattacctcctccgtggcgccgttgaacactactaatttcacagaagatagaggttcggcgtca 

365 QQKNIMQITFSVAPLNTTNFTEDRGS AS 

41577 gggacgttcactactatttccctaatgactaactcgtccgcgaacttagctggtaaceacgggccggacaagtctcacacagtt 

393 GTPTTISLMTNSSANLAGNYGPDKSYIV 

41661 aaggctaaaatccaagacaggttcacttcgactgaatttagtgctacggtagctaccgaatcagtagttcttaactatgacaag 

421 KAKIQDRFTSTEFSATVATBSVVLMYDK 

41745 gacggtcgacttggagttggtaaggttgcagaacaagggaaggcagggtcaattgatgcagcaggtgatatatatgctggaggt 

449 DGRLGVGKVVE QG KAGS IDAAGD ZYAGG 

41829 cgacaagttcaacagtttcagctcactgataataatggagcactgaacaggggtcaatataacgatgtttggaataagcgtgaa 

477 RQVQQPQLTDNNGALNRGQYNDVWNKRE 

41913 acagagtttacatggcgaagtaacaaatacgaggacaaccctacgggaactcgaggtgaatggggactatttcaaaatttctgg 

505 TEFTWRSNKYEDNPTGTRGBWGLFQNFW 

41997 ttagatagctggaaaatggttcaatccttcattacaatgtcaggaagaatgttcateaggacagcgaacgatggaaacagctgg 

533 LD9WKHVQSFITMSGRHFIRTANDGNSW 

42081 agacctaacaagtggaaagaggttctatttaagcaagacttcgaacagaataactggcagaaacttgttettcaaagtgggtgg 

561 RPNKWKEVLPKQDFEQHHWQKLVLQSGW 

42165 aaccatcactcaacctatggcgacgcatcccattcgaaaactcttgacggcatagtatatccgagaggaaatgtgcataaagga

589 SHHSTYGDAFYSKTLDGIVYLRGHVHKG 

42249 cttatcgacaaagaggctactattgcagtacttcctgaaggatttagaccgaaagtttcaatgtatcttcaggctctcaataac 

617 LIDKEATIAVLPEGFRPKVSMYLQALNN 

42333 tcatatggaaatgccattctatgtatatacactgacggaagacttgtggtgaaatcgaatgtagataattcttggttaaattta

645 SYGNAILC1YTDGRLVVKSHVDNSWLNL 

42417 gacaacgtctcatttcgtatttaa 42440 

673 DNVSFRX * 
dp1ORF005

23674 atggctaaaaaatcaaaagctatctcacacacagacgaactgattagtcagtcgtttgacagccccttggcaaagaatcaaaag 

1 MAKKSKAISHTDBLISQSPDSPLAKNQK 

23758 ttcaagaaagagcttcaggaagttgaaaagtattatcaatacttcgacggatttgatgtcacggacttgaatactgactatggg 

29 FKKELQEVEKYYQYPDGPDVTDLNTDYG 

2384 2 caaacaeggaagattgacgaagactcagtcgactataaacctactcgagaaattcgaaactatattcgacaacttatcaaaaag 

57 QTWXIDBDSVDYKPTREIRNYIR-Q -K V 

23926 caatcacgctttatgatgggtaaagagccagagcttatctctagtccagttcaagacaatcaagatgaacaggctgagaacaag 

85 QSRFMMGKEPELI FSPVQDNQDEQABNK 

24010 cgtattctattcgactctattttaaggaattgtaaattctggagcaaaagtacaaatgcattagtcgaegccacagtaggcaag 

113 RILFDSILRNCKFHSKST NALVDATVGK 

24094 cgggtattgatgacagtagtagcaaatgccgctcaacaaattgacgtccagttttattcaatgcctcagttcacctatacagtt 

141 RVLMTVVANAAQQIDVQPYSMPQFTYTV 

24178 gaccctagaaacccttccagcttgctttctgttgacattgtttatcaggacgagcgtacaaaaggaatgagcactgaaaaacaa

169 DPRNPSSLLSVDIVYQDERTKGMSTEKQ

24 262 ctttggcatcattatagatatgaaatgaaagccggaacaagtcaatcaggaattgcaacagcttcagaagacattgaagaacaa 

197 LWHHYRYEMKAGTSQSCXATALEDIEEQ 

2434 6 tgttggctcacctatgecctaacggatggagagEcgaaccaaacctatatgacagaaagtggccaaactactatcaaggagaca 

225 CWLTYALTDGESNQIYMTESGQTTIKET 

244 30 gaggctaaacttgtagaaattgaagacaacctaggaaacaagattgaagttccttcaaaagttcaagaatccgccccaaccggc 

2S3 EAKLVEIEONLGNKI EVPLKVQESAPTG 

24514 ttgaagcaaattcctcgtcgagtcactcctaacgaaccattgactaatgacatatacgggacaagcgatgccaaagaccctatc 

281 LKQIPCRVILNEPLTNDIYGTSDVKDLI 

24598 acagcagcagataacttgaacaaaactattagtgacttacgagatccacttcgatttaaaatgttcgagcagcctgttatcatt 

309 TVADNLNKTISDLRDSLRPKMPEQPVI I 

24682 gatggctcttctaagtcaattcaaggaatgaagattgcgccaaacgctttggtcgaccttaagagcgaccctacctcctcaatc 
337 - DGSSKSIQGMKIAPNALVDLKSDPTSSI 

24766 ggcggtactggaggcaageaagcccaagtcacttccatctcaggaaacttcaacttccttccagcggctgaatattatttagag 

365 GGTGGKQAQVTSISGNPNFLPAAEYYLE 

24a SO ggcgccaagaaagccatgtatgaactaatggaccagccaatgcctgaaaaggtacaggaggcgccatcaggaatcgcaacgcag 

393 GAKKAMYELMDQPMPEXVQEAPSGIAMQ 

24 934 ttcttattctacgacctaatttctcgatgtgacggaaaatggattgagtgggatgatgctattcaatggctcattcaaacgctg 

421 FLFYDLJSRCDGKWIEWDDAIQWLIQML 

25018 gaagaaatcttagcaacagtgaacgttgacttgggaaatattcctcaagatattcaaccaagctatcaaacacttacgacaatg 

449 EEILATVNVDLGNI PQDIQSSYQTLTTM 

25102 actatcgaacaccactatccaattcctagcgacgaactttctgctaagcaacctgcgctcactgaagttcaaaccaatgtacgc 

477 TIEHHYPIPSDELSAKQLALTEVQTNVR 

25186 agccaccaatcttacattgaagaattcagtaagaaggaaaaggcggacaaggaatgggaacgcactctggaagaacttgctcag 

505 SHQSYIEEFSKKEKADKBKERILEELA Q 

25270 cttgacgaaatctcagctggagcattgcctgtattagcaaacgaattaaacgaacaagaggagcetcaagatgaaacgagtgaa 

533 LDEISAGALPVLANELNEQEBPQDETSE 

25354 gaagacgaagtcgacgacaaagaaaaagaacaaactgaacaaccaaccgaagaaggagtcgacccagacgttcaaggttaa 25434

561 EDEVDDKEKEQTEQPTEEGVDPDVQG* 
dp1ORF006

45296 atgattgaaatcgctatagcacgttcgaaagctaggcgaggtcgaaccctatttattgaaacacgggcaagcaccgatgaagat 

1 MIEIVIARSKARRGRTLFIBTWASTDED 

45380 gcagttaaaatggcagaaaagatctccagcccgcccaatgcagccgagacgtcttctaataacctcgaactaccttacaagtat 

29 AVKMAEKISSLPNVVBTSSNNFELPYKY 

45464 ttcaataatgttatagacgctctagatgaatgggagcttcacatcttcggcgaacttgataaagatgttcaagactacattgac 

57 FtfNVIDALDEWBLHIPGBLDKDVQDYID 

45548 tcccgaaaccgaatagcttcttcaagcaatgagcagtttccgttcaagactactccattcgcgcaccaggttgaatgtttcgaa 

85 SRNRIASSSNEQFSFKTTPFAHQVECPE 

4S632 tacgcacaagagcatccatgtttcctcttaggcgacgagcaaggtttagggaaaactaaacaggcaactgatattgcagttagc 

113 YAQEHPCFIiLGDEQGLGKTKQAIDZAVS 

45716 aggaaggcaagtttcaaacattgtttaatcgtatgttgcatatcagggctcaaatggaattgggcaaaagaagtaggtactcat 

141 RKASFKHCtlVCCISGLKWNWAKEVGlH 

45800 tcaaatgagtcagctcacattttaggaagccgagtcaccaaagatgggaaattagtgattgacggagtttctaaacgggcagaa 

169 SHESAHILGSRVTKDGKLVI DGVSKRAE 

45884 gacttgctcggtggccacgacgaactcttccttatcactaacattgaaactcttcgcgatgccgtgcccactaaatacctaaat 

197 DLLGGHDEFFLITUI BTLROAVFI KYLN 

45968 gaactgacaaaaagcggagaaattggaatggttattattgacgagattcacaagtgtaagaacccttcaagtaagcaaggggct 

225 BLTKSGBIGMVIIDEIHKCKNPSSKQGA 

46052 tcaattcaaaagcteeaaagttattacaagatgggacttacaggaactcctctaatgaataacccaatcgatgtattcaaEgtt 

253 SIQKLQSYYKMGLTGTPLMNNPIDVFNV 

46136 atgaagtggctaggggcggaacatcatacactgactcagttcaaagagcgatactgtatcgtcgaccagttcaatcaaaccacc 

281 MKWLGAEHHTLTQFKERYCIVDQFKQIT 

46220 ggatatcgaaatctagccgaacttcgcgagcttgtcaacgactacatgcttagaagaacgaaggaagaagttttagacctgcct

309 GYRNLAELRELVNDYMLRRTKEEVLDLP 

46304 gaaaagattcgagtcacagagtatgtcgacatga&ctcgaaacagtcaaaaatctataaggaagttttgactaaacttgttcaa 

337 EKIRVTEYVDMMSKQSKIYKEVLTKIjVQ 

46388 gaaatagataaagtcaagctcatgcctaaccctctagccgaaacgattcgacttcgacaagcgactggaaatccttcgatttta 

365 EIDKVKLHPNPliAETlRLRQATGNPSIL 

46472 actactcaagatgtcaagtcttgcaagttcgaaagatgtatcgaaattgtcgaggaatgtatccagcaaggaaagtcctgcgtg 

393 TTQDVKSCKFERCIEIVEECIQQGKSCV 

46556 acatctagcaattgggaaaaggttatcgaacctcttgctaagatactttcgaagacagtcaaacgcaacetggxaacaggagaa 

421 IFSNWBKVIBPLAKILSKTVKCNLVTGE 

46640 accgcagataagttcaacgaaattgaagaattcacgaaccacagaaaggcctccgctattttaggaactataggtgcgctagga

449 TADKFNEIBEFMNHRKASVIUGTIGALG 

46724 acaggatttactttgacgaaagcggatacggttattttcttagatagtcggtggacacgcgcagaaaaggaccaagccgaagat 

477 TGFTLTKADTVIFLDSPWTRAEKDQAED 

46608 aggtgtcatagaactggcgcaaaaagttctgtcactatctacacgcttgtcgccaaaggtactgttgacgaacgtatagaaga'c 

505 RCHRIGAKSSVT Z YTLVAKGTVDEJIIED 

46892 cttactgaacggaaaggagaattagcagattatatcgtagatggtaagcctatgaaatctaaaatcggtaaccttttcgacatc 

533 LI ERKGELADYIVD GKPMKSKI GNLFDI 

46976 ctgectaaatag 46987 

561 L L K * 
dp1ORF007

22230 atgacaataagcccgagaaataaactacctaagttcaacttcgtcccttttagtaagaaacaactccagctcctaacatggtgg

1 MTISLRNKLPKFNFVPFSKKQLQLLTWW

22314 acaaagggctcaccttttcgaacttccgatatcgccacagcagacggtcccacccgttcaggaaaaacagtatcgacggctctt 

29 TKGSPFRTFDIVIADGSIRSGKTVSMAL 

22398 tcattttccctttgggccatgacggaattcaacggacaaaactttgccacctgtggtaagacaattcactcagctcgacgaaat 

57 S PS LWAMTEFNGQMFAICGKTIHSARRN 

22482 gttattcagcctctaaagcaaatgctcacaagtcgcgggtatgaaattcgagatgttcgaaatgaaaatctacttattattaga 

65 VIQPLKQMLTSRGYEIRDVRHENLLIIR 

22S66 cactttagaaatggcgaagaaatcgtcaactacttctatatatttggaggaaaagatgagtcgagtcaagaccttatacagggg 

113 HFRMGEEIVNYFYIFGGKDESSQDLIQG 

22650 gtaacattagcaggtatctcctgtgatgaggtggcaccgacgcctgaaccgtttgtcaaccaagcgacagggcgctgttccgta 

141 VTLAGI FCDEVALMPESFVNQATGRCSV 

22734 acaggtccgaaaatgtggccctcctgraacccggccaatcctaatcactacttcaagaagaactggattgacaaacaggtcgaa 

169 TGSKMWPSCNPANPNHYFKKNW1DKQVE 

22818 aagcgtatcttatatcttcactttacaatggacgacaaecctagcttgacggatagcaetaaaaggcgctatgagaaaatgtat 

19 7 KRI LYLHFTMDDNPSLTDS I KRRYBKMY 

22902 gctggagtcttcaggaaaagatttattctcggcctttgggtaacagcagatggtetagtttattcaatgttcaatgaagagcag 

225 AGVFRKRFILGliWVTADGLVYSMFNEBQ 

22986 catgtcaaaaagctcaatatagaattcgaccgtttattcgtagcaggcgactttggtatctataatgcaacaaccttcggcctt 

253 HVKKLN I EFDRLFVAGDFG I YNATTFGL 

23070 tatggattctcgaaacgtcataagcgctaccatctaattgagtcatactaccactcagggcgcgaggcggaagagcaactaact 

281 YGFSKRHKRYHLIESYYHSGREABBQLT 

23154 gaggcggatgttaattcgaatattcaatttagttcagttctacaaaagactactaaagagtacgcaaatgatttagtcgatatg 

309 EADVNSNIQFSSVLQKTTKEYANDLVDM 

23238 atacgaggaaagcaaatcgaatatataattctcgacccgtctgcttctgctatgattgttgaacttcaaaagcatccttatata

337 IRGKQIEYI ILDPSASAMIVEliQKHPYI 

23322 gctagaaagaatatccctatcattcctgctcgaaatgacgtgacgcttggcatttcatttcacgctgaactcttggctgagaat 

365 ARKN I PI I PARNDVTLGIS FHAELLAEN 

23406 agatttacactcgaccctagcaacacgcacgacattgatgaatactatgcttacagctgggacagtaaagcgagccaaacggga 

393 RFTLDPSNTHDIDBYVAYSWDSKASQTG 

234 90 gaagatagagtcattaaagagcatgaccaccgcatggataggaacagatatgcctgtctcactgacgctctaatcaacgatgac 

421 BDRVIKEHDHCMDRNRYACLTDALINDD 

23574 ttcggtttcgaaatacaaatattatccggaaaaggcgctagaaactaa 23621 

449 FGFEIQI LSGKGARN* 

dp1ORF008

49624 gtgatacagcttcaagtcttaaataaagttctcgaagaaaagagcttatccattttagaaaaeaatggaattgaccaagaatac 

1 VIQLQVLNKVLEBKSI*SI L5NNGZ0QBY 

49708 ttcacggattatttagacgagtatcaatttattcaagaacacttttcgagatatggaagagttccggacgacgaaactattctc 

29 FTDYLDEYQFIQEHFSRYGRVPDDETIL

49792 gaccattttcctggattcgaatttttcgaaattggcgaaactgatgaataccttatcgacaagctaaaagaggagcatctatat 

57 DHFPGFEFFEIGETDEYLIDKLKEEHLY

49876 aattcacttgttccaaetttaacggaagcggctgaggacattcaagtagatagtaacattgcgattgcgaatataattccaaaa

85 NSLVPILTEAAEDIQVDSNIAIANIIPK

49960 ctagaagaacttttcaatcgctctaaattcgtaggcggaccagacattgctcgaaatgctaaacctcgactagactgggcgaat 

113 LEELFNRSKFVGGLDIARNAKLRLDWAN

50044 actattagaaaccatgacggtgaaagacttggaatatcgacagggtttgaactattggacgacgtgcttggaggcttacttcct 

141 TIRNHDGERLGISTGFELLDDVLGGLLP

50128 ggtgaggatttgattgtcataatggctcgacctggacaaggtaagtegtggactattgataaaatgcttgcaactgcttggaag 

169 GEDLIVIMARPGQGKSWTIDKMLATAWK

50212 aacgggcatgatgcccttctatatagcggggaaacgagcgaaatgcaagttggtgctcgtatagatactattctttcgaatgtt 

197 NGHDVLLYSGEMSEMQVGARIDTILSNV

50296 agcatcaattcaattaccaaagggatttggaacgaccaccagttcgaaaaatatgaggaccatattcaagcaatgactgaggct 

225 SINSITKGIWNDHQFEKYEDHIQAMTEA

50380 gaaaattcccttgtggtagtcacgccctctatgattggaggaaagaaccttacccctgcaattttagatagcatgatatctaaa 

253 ENSLVVVTPFMIGGKNLTPAILDSMISK

50464 tatagaccatctgtggtggggactgaccagctttcactcatgagcgagtcttatccaagcagggagcagaagcgaatccagtac 

281 YRPSVVGIDQLSLMSESYPSREQKRIQY 

50548 gccaacatcaccatggacctatataagatttccgccaaatatggaattcctattgtgcttaatgtccaagcagggcgttcggct 

309 ANITMDLYKISAKYGIPIVLNVQAGRSA

50632 aaaactgaaggcgctgaaagtaeggaactagaacacacagcagaaagtgatggagtaggtcaaaatgctagcagagttatcgct 

337 KTEGAESMELEHIAESDGVGQNASRVIA

50716 atgaagcgtgacgaaaaatccggcatacctgaactatctgtcgttaaaaaccgatatggcgaagaccgaaaaatcatcgaatat

365 MKRDEKSGILELSVVKNRYGEDRKIIEY

50800 atgtgggacgttgaaactggaacctatactcttataggattcaaagaggaaggcgaagaaggaactgaaaaaggcgaaagctct 

393 MWDVETGTYTLIGFKEEGEEGTEKGESS 

50884 ccattgaaagcaaaagcctctaggtcgactgctcgtcctcgaagtaaggtcacaagggaaggagttgaagcattttga 50961 

421 PLKAKASRSTARLRSKVTREGVEAF*
dplORF009

13160 atgacagactttaaaaaacgcttcaagaaagcagtaacagaaacaatcaatcgtgacggtatcgagaaccttatggattggctc 

1 MTDFKKRFKKAVTETINRDGIENLMDWL

13244 gaaaatgataccaatttcttctcaagtccagcaagcactcgacaccatggaagctatgaaggtggacttgtcgagcactcatta 

29 ENDTNFFSSPASTRYHGSYEGGLVEHSL

13328 aacgtgttcaatcaactacttttcgaaatggataccatggtaggcaaaggctgggaagacatttacccaatggaaacagttgca 

57 NVFNQLLFEMDTMVGKGWEDIYPMETVA

13412 atcgtagcactatttcacgacctttgcaaagttggtcagtatcgcgaaactgaaaaatggcgcaagaacagcgacggtgaatgg 

85 IVALFHDLCKVGQYRETEKWRKNSDGEW

13496 gaaagctatttagcatatgaatacgaccccgagcaactcacaatgggacatggtgcaaaatctaatttccttcttcaacgttcc 
113 ESYLAYEYDPEQLTMGHGAKSNFLLQRF

13580 attcaacccacgccagccgaagctcaagcaatcccccggcacacgggagcctacgatattagtccccatgcaaatttgaacgga 

141 IQLTPVEAQAIFWHMGAYDISPYANLNG

13664 cgcggagcagcctccgaaactaacccacttgcatccttaatccatcgcgcagatatggccgcaactcacgtagtcgaaaacgaa 

169 CGAAFETNPLAFLIHRADMAATYVVENE 

13748 aacttcgaatactctcaaggtccagttgaacaagaggccgaggctgaagaagtagttgaagaaaaacctaagagttcaactcgt

197 NFEYSQGPVEQEAEVEEVVEEKPKSSTR 

13832 aagaaacccgcgcctaaggaagaaaaagttgaagaggccgaagaaaaaccaaaagctggaatcactcgacgtcgcaaacccgcg

225 KKPAPKEEKVEEAEEKPKAGITRRRKPA

13916 ccaaaagaggaagaggtagaagagcctaaagaagagcctaagaaagcatcttctaaaatccgaacgcctaaaaagactgaaaag

253 PKEEEVEEPKEEPKKASSKIRMPKKTEK

14000 gccgaagaggtagaaagcgcagacgagccgaaagtcgaagaagcagaggacgacaatgtggtggtacctgccggacacgttcga 

281 VEEVESADEPKVEEAEDDNVVVPAGYVR

14084 gatgtctactactcccacagtgaagtcgctgacgtttactacaagaaagatgtcgacgagcctgacgatgacagcgacattctt 

309 DVYYFYSEVADVYYKKDVDEPDDDSDIL

14168 gtagacgaagaagagtacatggacgcaatgtgtcccgtattagaagaagacttcttctacgaacttgacggcaaggcccacaaa 

337 VDEEEYMDAMCPVLEEDFFYELDGKVHK

14252 ttagcaaaaggtgaacgcttgccggaagaatacgacgaagaaacttgggaacctaccactgaagcagaatacaccaagcgaaca 

365 LAKGERLPEEYDEETWEPITEAEYIKRT

14336 gaaaaacctaaagcagttgcaaaacctacccgaaaaactccagcgccttctcgtcgccctcgcccttaa 14404 

393 EKPKAVAKPTRKTPAPSRRPRP * 

dplORF010

8699 atgaaattggaacagttgatgaaggactggaataaggattcgaaagctcttgtagcagttcaaggacttgaacgtgaagcgctt 

1 MKLEQLMKDWNKDSKALVAVQGLEREAL

8783 ccaagaatccctctttccgcgccttctatgaactatcaaacctacggcgggctcccccgaaaaagggtagttgaattcctcggt 

29 PRIPFSAPSMNYQTYGGLPRKRVVEFFG

8867 cctgagccaagtgggaaaactacttcagctctcgacaccgccaagaatgcgcaaatggtattcgagcaggaacgggaacagaag 

57 PESSGKTTSALDIVKNAQMVFEQEWEQK

8951 actgaagaactcaaggaaaagccggaaaatgcgcgtgcatccaaagctagcaagactgctgtcaaggaacttgaaatgcaactc 

85 TEELKEKLENARASKASKTAVKELEMQL

9035 gatagtcttcaagagcctctcaagattgtatatcttgaccttgagaatacattagacaccgagtgggctaaaaagactggagtc 

113 DSLQEPLKIVYLDLENTLDTEWAKKIGV

9119 gatgttgacaatattcggatagttcgccctgaaatgaacagcgctgaagaaatacttcaatatgctttagacattttcgaaaca 

141 DVDNIWIVRPEMNSAEEILQYVLDIFET

9203 ggtgaagttggcctagtagttctagattccctgccttacatggtcagtcaaaaccttattgatgaagagttgactaaaaaggcc 

169 GEVGLVVLDSLPYMVSQSLIDEELTKKA

9287 tatgcaggaatctcagcgcccttgaccgaacctagtcgaaaggtcactcctcttcttactcgctacaatgcaatattcctaggc 

197 YAGISAPLTEFSRKVTPLLTRYNAIFLG

9371 atcaatcaaattcgagaagatatgaatagtcagtacaatgcctattcaactccaggcggaaagatgtggaagcatgcttgtgca

225 INQIREDMNSQYNAYSTPGGKMWKHACA

9455 gttcgacttaaatttagaaaaggtgactaccttgacgaaaacggtgcatcattgacccgtactgctcgaaaccctgcagggaat 

253 VRLKFRKGDYLDENGASLTRTARNPAGN

9539 gtagtagagtcattcgtcgagaagaccaaagcatttaagccggacagaaaattagtttcctatacgctttcctatcatgatgga 

281 VVESFVEKTKAFKPDRKLVSYTLSYHDG

9623 attcaaattgaaaatgaccttgtagatgtcgctgtcgaatttggagtcattcaaaaggcaggggcatggttcagtatcgtcgac 

309 IQIENDLVDVAVEFGVIQKAGAWFSIVD

9707 cttgaaactggagaaattatgacagatgaagacgaagaaccattgaagttccaaggcaaggcaaatctagttcgacgcttcaag 

337 LETGEIMTDEDEEPLKFQGKANLVRRFK

9791 gaggatgactacttattcgacatggtgatgactgcggttcacgaaattatcactcgagaagaaggctaa 9859 

365 EDDYLFDMVKTAVHEIITREEG *
dplORF011

28017 atgaatatttatgattatatcaacgcaggggagattgctagctacattcaagcacttccttcaaacgctcttcaataccttgga 

1 MNIYDYINAGEIASYIQALPSNALQYLG

28101 ccaactcttttccctaatgctcaacaaacagggacagacatttcatggctcaagggtgcaaataatttgccagtaactatccag 

29 PTLFPNAQQTGTDISWLKGANSLPVTIQ 

28185 ccatctaactacgacgcgaaagcaagtcttcgtgaacgtgctggatttagcaaacaagctactgagatggcattcttccgtgag 

57 PSNYDAKASLRERAGFSKQATEMAFFRE

28269 tctatgcgacttggtgaaaaagaccgtcaaaacttgcaaatgctattgaaccaaagttcagctcttgcccaaccacttatcact 

85 SMRLGEKDRQNLQMLLNQSSALAQPLIT

28353 caactctataatgatactaagaaccttgtagacggtgttgaagcgcaagcagaatacatgcgcatgcaattgcttcaatacggt 

113 QLYNDTKNLVDGVEAQAEYMRMQLLQYG

28437 aaattcactgtcaaatcaactaacagcgaggctcaatacacttacgactacaacatggatgctaagcaacaatatgcagtcact

141 KFTVKSTNSEAQYTYDYNMDAKQQYAVT

28521 aagaaatggactaacccagctgaaagtgaccctatcgctgacattttagcagcaatggatgacatcgaaaatcgtacaggtgtt

169 KKWTNPAESDPIADILAAMDDIENRTGV 

28605 cgccctactcgaatggtcttgaaccgaaacacttataaccaaatgactaagagtgactctatcaagaaagctcttgcaattggc 

197 RPTRMVLNRNTYNQMTKSDSIKKALAIG

28689 gttcaaggttcttgggaaaacttcttgcttcttgcaagtgacgctgagaaattcatcgctgaaaaaacaggcct^caaatcgct 

225 VQGSWENFLLLASDAEKFIAEKTGLQIA

28773 gcctactctaagaaaattgctcagttcgctgacgctgacaaacttcctgacgttggtaacattcgtcagttcaacttgattgac 

253 VYSKKIAQFADADKLPDVGNIRQFNLID

28857 gacggtaaagtggtattgcttccacctgacgcagttggtcacacttggtacggtactactccagaagcattcgacttggcttca 

281 DGKVVDLPPDAVGHTWYGTTPEAFDLAS

28941 ggcggaacagacgctcaagttcaagttctttcaggcggacctaccgttacaacttatcttgaaaaacatcctgtcaacattgca 

309 GQTDAQVQVLSGGPTVTTYLEKHPVNIA 
29025 acagtegtatcagctgttatgatcccatcattcgaaggaattgactatgtaggagctctcacaaeeaateag 29096

337 TVVSAVMIPSFEGIDYVGVLTTN*
dplORF012

5346 atgagcattaageecaaaaccgaagaactttcaaaaattgtttctcagctcaataagttgaagcctagcaagttgctagaaatc 

1 MSIKFKTEELSKIVSQLNKLKPSKLLEI 

5430 acaaaccactggcatatttttggtgacggcgaacgcgtcatgtccacagcgtatgatggctcaaacttcccccgacgcactatc

29 TNYWHIFGDGECVMFTAYDGSNFLRCII

5514 gacagcgatgtcgaaattgacgcgattgcgaaagcagagcagtttggaaaacccgtagaaaagaccacggccgcaaccgtcaca 

57 DSDVEIDVIVKAEQFGKLVEKTTAATVT

5598 ctagttcctgaagaaccttcgctaaaagttaetgggaatggtgagtacaatattgatattgttacagaagatgaagagtaccct

85 LVPEESSLKVIGNGEYNIDIVTEDEEYP

5682 acaetcgaccacttgctcgaagacgtgagtgaagaaaatgctctcactttgaaaagctcgctgttctacggaatcgccaatatc 

113 TFDHLLEDVSEENALTLKSSLFYGIANI 

5766 aacgaccctgcggtatctaaatcaggagcagatggaacttataccggcttcctgttaaaaggcggaaaagcaattactacagac 

141 NDSAVSKSGADGIYTGFLLKGGKAITTD 

5850 atcatecgcgtatgtatcaaccetatcaaggaaaagggactagaaatgctcattccteacaacctaatgagtattttagcaagt 

169 IIRVCINPIKEKGLEMLIPYNLMSILAS

5934 attcccgatgagaagatgtacctctggcaaactgacgacactactgtctatatttcatcggcctcagtcgaaatttatggaaaa 

197 IPDEKMYFWQIDDTTVYISSASVEIYGK

6018 ttgatggaaggtatggaagattatgaagacgtttcacagcttgactcaattgagtttgaagacgatgcggccatccccacagca 

225 LMEGMEDYEDVSQLDSIEFEDDAAIPTA

6102 gaaatccrgagcgtattagaccgccttgtaccattcacctcagcctttgacaaaggaaccgtcgaatitcctactcttgaaagac 

253 EILSVLDRLVLPTSAFDKGTVEFLFLKD

6186 cgacttcgaaeeaaaacttctactagcagttatgaagacatcatgtacgcatctgctggcaagaaagcctcgaagaaagaattc 

281 RLRIKTSTSSYEDIMYASAGKKVSKKEF 

6270 acttgccaccttaacagcttactctttgaaggaaattgtatcaaccgtcaccgaagaaaacetcactgtctcttatggaagcgaa 

309 TCHLNSLLLKEIVSTVTEENFTVSYGSE

6354 accgcaattaagactecaccgaatggtgccgtttacttcctagcacttcaagagccggaagaataa 6419 

337 TAIKISSNGVVYFLALQEPEE*
dplORF013

10215 atgaatctagcttctaaataccgtccecaaactttcgaggaagtggtagctcaagaataegteaaagaaattcttttgaaecaa 

1 MNLASKYRPQTFEEVVAQEYVKEILLNQ

10299 ccacaaaatggcgctatcaaacacggctatctattctgtggtggcgctggaactggtaaaaccactactgcccgaatcttcgcg 

29 LQNGAIKHGYLFCGGAGTGKTTTARIFA

10383 aaggatgtgaacaaaggacttggctctcctactgaaattgatgctgcttctaataatggggtagaaaacgttcgaaacattatt 

57 KDVNKGLGSPIEIDAASNNGVENVRKII

10467 gaagactctagatacaagtctatggacagcgagtccaaagtttacatcattgacgaggttcacatgccttcaaccggagcattt 

85 EDSRYKSMDSEFKVYIIDEVHMLSTGAF

10551 aatgegctgttgaaaacattagaagagccctcatcgggaaccgtgttcattctatgtactactgaccctcaaaagattcctgac 

113 NALLKTLEEPSSGTVFILCTTDPQKIPD

10635 actattctcagtcgagttcaacggttegactttactcgaattgataatgacgacatcgttaatcaacttcaatttattatcgaa 

141 TILSRVQRFDFTRIDNDDIVNQLQFIIE

10719 agtgaaaatgaagaaggagctggttatagctatgagcgtgacgccccttcgtttattgggaaacttgcaaatggaggaatgcgt 

169 SENEEGAGYSYERDALSFIGKLANGGMR

10803 gacagtatcacaaggctcgaaaaagtccttgattatagtcatcacgttgacatggaagccgrttctaatgcactaggagttccg 

197 DSITRLEKVLDYSHHVDMEAVSNALGVP

10887 gactacgaaacaeecgcttcactcgttgaagctatcgccaaccacgacggctcaaagtgtttagaaatcgcaaatgacttccac

225 DYETFASLVEAIANYDGSKCLEIVNDFH

10971 tacccaggaaaagacttgaaattagcgactcgaaaetctacagactcccttttagaggtttgtaagtateggccagttcgagat 

253 YSGKDLKLVTRNFTDFLLEVCKYWLVRD 

11055 atttcaatcacccaacttcctgctcatcttgaaagcaagccagagcaatcctgtgaggcttttcaatatcctactctattgtgg 

281 ISITQLPAHFESKLEQFCEAFQYPTLLW

11139 atgctagaagaaatgaatgaacttgctggagtcgttaaatgggagcctaatgctaaaccgataattgaaaccaaacttcttttg 

309 MLEEMNELAGVVKWEPNAKPIIETKLLL

11223 atgagcaaggaggagtga 11240 

337 M S K E E *
dplORF014

50961 atgaaagtaaatggtcttcaaaccgaagcgactcctgaacaaataattgaaaaactttcgagacaactcgaagacgaaggaaca

1 MKVNGLQIEATPEQIIEKLSRQLEDEGT

51045 ttcatttttagacgaactaagtcgcttggaagcaactatcaattctcatgcccgtttcatgcaggagggactgaaaagcatccc

29 FIFRRTKSLGSNYQFSCPFHAGGTEKHP

51129 tctcgcggcatgagtagaaatcctccttatccaggaagtaaggcgacggaagccggaacggctcaccgtttcacccgcggctac 

57 SCGMSRNPSYSGSKVTEAGTVHCFTCGY

51213 acttcaggactaactgaattcgtctcgaacgcactaggtcgaaacgatggagggtcctatggaaaccagtggctgaaaaggaat 

85 TSGLTEFVSNVLGRNDGGFYGNQWLKRN

51297 tttggaacatctagcgaagtagteaggcaaggcgtcagccctgaagcgtttcgaagaaatgggagaactgaaaaagtcgagcat 

113 FGTSSEVVRQGVSPEAFRRNGRTEKVEH

51381 aaaatcattcctgaagaggaacttgataaacaccggtctattcatccttatatgtacgaacggaaactgacggacgagctcatc 

141 KIIPEEELDKYRFIHPYMYERKLTDELI

51465 gagatgtttgatgtaggttatgacaaactgcatgactgcatcacctttccagtacggaacctcaagggcgaaacagtattcttc 

169 EMFDVGYDKLHDCITFPVRNLKGETVFF

51549 aaccgtcgaagcgcccgttccaagcctcaccagcacggtgaagacgaccctaaaacggaattcccttacggccaacatgagctt 

197 NRRSVRSKFHQYGEDDPKTEFLYGQYEL

51633 gtagcacttcgagactatcccgaaaaacccaccagccaagcattcgcgactgagcccgtcatcaactgcccgactctttggtca 

225 VAFRDYFEKP I SQVFVTESV I MCLTLWS 

51717 atgaagactccagcagtcgctcctacgggagtaggcggaggaaaccaaaccaatttactaaaacgacttccttatagaaatatt
253 MKIPAVALMGVGGGNQINLLKRLPYRNI

51801 gctccagcacttgaccctgataacgctgggcagacagcgcaggaaaaactctaccgacagttaaagcgaagcaaggtcgttaga 

281 VLALDPDNAGQTAQEKLYRQLKRSKVVR 

51885 tttctgaactaccccaaagagtcctacgacaataagcgggatataaacgaccacccggaatcaccaaaccttaatgatttagtc

309 FLNYPKEFYDNKWDINDHPELLNFNDLV 

51969 ttgtag 51974 

337 L *
dplORF015

3793 atgggatttaatctatacttcgcaggaggtcacgctattagcactgacgattaettgaaggaaagaggagccaaecgcctaetc 

1 MGFNLYFAGGHAISTDDYLKERGANRLF

3877 aatcaactgtacgaaagaaacgggattggcaaaaggtggatcgagcacaagaaaaccaatccaagcactactccaaaactattc

29 NQLYERNGIGKRWIEHKKTNPSTTSKLF 

3961 gtcgactctagcgcatattctgctcataccaaaggggctgaagttgacaccgacgcccataccgaatacgtgaatgacaacgtg 

57 VDSSAYSAHTKGAEVDIDAYIEYVNDNV

4045 ggaatgtttgaccgcatcgccgaactcgacaaaatccccggcgtacttagacagcccaagacacgtgaacagctcctggaagca

85 GMFDCIAELDKIPGVFRQPKTREQLLEA

4129 ccacaaattccctgggataattatctatacacgcgcgagcgaacggctgagaaagacaagctcttacctattttccatatggga 

113 PQISWDNYLYMRERMVEKDKLLPIFHMG

4213 gaagaccttaaatggctcaacttgatgctcgaaaccacactcgaaggcggaaagcatattccttacattggaatttcaccagcc 

141 EDFKWLNLMLETTFEGGKHIPYIGISPA

4297 aatgactcgactaegaagcacaaagaeaagtggatggaaagagtattcgaagttattcgaaacagttccaatccagacgttaag 

169 NDSTTKHKDKWMERVFEVIRNSSNPDVK 

4381 actcacgcacttgggatgacagttactagccaactagagcgtcacccattctatagcgccgactccactcccgtaccgcccaca 

197 TKAPGMTVTSQLERHPFYSADSTSVLLT 

4465 ggagcgacgggaaacattatgacgtcaaaaggattagttgactcgtcacagaagaatggaggaactgatgctgtccgtaggctg 

225 GAMGNIMTSKGLVDLSQKNGGIDAVRRL 

4549 ccaaaaccggttcaagttgaaactgaatccactatcgaagaaactggagcgcatcttagcctagagcaaccagtcgaggaccac 

253 PKPVQVEIESIIEETGAHFSLEQLVEDY

4633 aaacttcgagcattgttcaatgttcaatacatgctgaattgggcagagaactacgaatccaagggaactaaaaatcgtcaacgt 

281 KLRALFNVQYMLNWAENYEFKGIKNRQR

4717 cgactattttag 4728 

309 R L F *
dplORF016

43413 atgggagccgatattgaaaaaggcgttgcgtggatgcaggcccgaaagggtcgagtatcttacagcacggaccctcgagacggt 

1 MGVDIEKGVAWMQARKGRVSYSMDFRDG

43497 cccgacagctatgactgctcaagctctatgtactatgctctccgctcagccggagcttcaagtgctggatgggcagtcaatact 

29 PDSYDCSSSMYYALRSAGASSAGWAVNT

43581 gagtacatgcacgcatggcttattgaaaacggttatgaactaattagtgaaaatgctccgtgggacgctaaacgaggcgacatc

57 EYMHAWLIENGYELISENAPWDAKRGDI

43665 ttcatctggggacgcaaaggtgctagcgcaggcgctggaggtcatacagggacgttcattgacagtgacaacatcatccactgc 

85 FIWGRKGASAGAGGHTGMFIDSDNIIHC 

43749 aactacgcctacgacggaatttccgtcaacgaccacgatgagcgttggtaccatgcaggtcaaccttactactacgtctatcgc 

113 NYAYDGISVNDHDERWYYAGQPYYYVYR

43833 ctgactaacgcaaatgctcaaccggctgagaagaaacttggctggcagaaagatgctaccggtttctggtacgctcgagcaaac 

141 LTNANAQPAEKKLGWQKDATGFWYARAN

43917 ggaacttatccaaaagatgagttcgagtatatcgaagaaaacaagtcttggttctactttgacgaccaaggccacatgctcgct 

169 GTYPKDEFEYIEENKSWFYFDDQGYMLA

44001 gagaaatggttgaaacatactgatggaaattggtatcggtccgaccgtgacggatacatggctacgccatggaaacggattggc 

197 EKWLKHTDGNWYWFDRDGYMATSWKRIG

44085 gagccatggtactacttcaatcgcgatggttcaatggtaaccggttggattaagtactacgataattggtatcactgcgatgcc

225 ESWYYFNRDGSMVTGWIKYYDNWYYCDA

44169 accaacggcgacatgaaatcgaatgcgttttacccgttataacgacggctggtatctactattaccggacggacgtctggcagat 

253 TNGDMKSNAFIRYNDGWYLLLPDGRLAD

44253 aaacctcaatecaccgtagagceggacgggctcattactgctaaagtttaa 44303 

281 KPQPTVEPDGLITAKV*
dplORF017

11242 atgattggacagggacttgttaaatctaccatttcgaaatggaaacaacttccaaaatatataatcgtcgaaggtgaagtaggt 

1 MIGQGLVKSTISKWKQLPKYIIVEGEVG

11326 tcaggacggaagaccttaatccgttatattgcttcgaaatttgacgccgattctattgtagtaggaacgagtgcagatgacatt 

29 SGRKTLIRYIASKFDADSIVVGTSVDDI

11410 cgaaacatcactcaggatgcacagactattctcaaggcgagaatccacgtgatagacggaaatagcctgccaatgtcagctctt 

57 RNIIQDAQTIFKARIYVIDGNSLSMSAL

11494 aacccgctttcgaagacagcggaagagccacccctaaactgccatacagccatgactgtcgatagcatcaacaatgctctacct 

85 NSLLKIAEEPPLNCHIAMTVDSINNALP

11578 acgcttgcaagtagagcaaaagttctaaccatgetaccttacactaatgaagagaaaacgcagtctgtcaagtcctacaagaag 

113 TLASRAKVLTMLPYTNEEKMQPVKSYKK

11662 gtagatacttcaggaattgacgaccgagcgattgtagactattgcaatcttgccagcaatctteaaatgcttgaagacatatta 

141 VDTSGIDDRAIVDYCNLASNLQMLEDIL

11746 gaatatggcgcagaagagctatctgaaaaggttacaacactctatgactcaacatgggaggcaagtgccagftaattcgccaaag 

169 EYGAEBI^FEKVTTFYDIilHBASASNSLK 

11830 gttactaactggcccaaacttaaggaaaccgatgaaggaaaaactgagcctaaactttccctcaactgcccttcaaaccggtcg 

197 VTNWLKFKETDEGKIEPKLPLNCLLNWS

11914 acagttgccatcaggaagcactatgtagaaacgcctcccgaagaacctgaggcccatgaccttttagcgagggaagcacctagg 

225 TVVIRKHYVEMSPEELEAKDLLVREASR
11998 tgtttgcgaaaggtatctaaaaagggctcaaatgcgcgtgtctgcgtgaacgaatttaccaggagggccaaacaagttgagtga 12081

253 CLRKVSKKGSNARVCVNEFIRRVKQVE*
dplORF018

35847 atggctagcagacagacgctattggtcgacggaaetgaccttgtcgacaaaggtgcaaccgcgctagaatatgtaggactcact

1 MASRQTLLVDGIDLVDKGATVLEYVGLT

35931 ctcgcaggacctaaggactcaggatccaaaaaccctgaaggcatagacggagtattagattctccgcccaacgctatgtccgct 

29 FAGPKDSGFKNPEGIDGVLDSPSNAMSA

36015 cttactggaagcgtgaccttaatgttccacggagaaaccgaaaagcaagttaatcaaaaatacaggcagetcaaacaatttatt 

57 LTGSVTLMFHGETEKQVNQKYRQPKQFI

36099 cgcccgaagtcatcctggagaacctcgacacccgaagaccccggataccatcgaacgggaaaacctttaggagaaaccgagcaa 

85 RSKSPWRISTLEDPGYYRTGKFLGETEQ

36183 ggaaaacttgtagacgctcaagcctttaaagacactcccctcgtagttaaattagggactcagcccaaagatgcctacgagtac

113 GKLVDVQAFKDTSLVVKLGIQFKDAYEY

36267 agcgactcaactgttcgaaaggtttacaagttccaacccgctttgggaggcgatagcttacctaacccaggaagacctactcga 

141 SDSTVRKVYKFQPALGGDSLPNPGRPTR

36351 caatttagageagaaataagaactacttctcaaatcaaaggacactttcgaattggcgaaaaaagttcaggacagtttgttgag

169 QFRVEIRTTSQIKGYFRIGEKSSGQFVE

36435 ttcggcactaattcagtattgatggaaagcggcccgattattattctaaatcttggaacttttgaacttattaaaattagcagt 

197 FGTNSVLMESGSIIILNLGTFELIKISS

36519 gcaaatcaagcgactaacttatttagatacactaaacgaggcgcactcttcaagattcctaatggaaattcaacaatcaccatt 

225 ANQATNLFRYIKRGAFFKIPNGNSTITI

36603 gaacaccgagccgatgacgcagcagctcggacctctactcttcccgctcaagttgaactgtttccaaacccgtcttactattag 36686

253 EYRADDAAAWTSTLPAQVELFLNPSYY*
dplORF019

12161 atgaatgtttatctcaatcaaatgggaaacgtagttcgagaaacttcggtttcaacagcctggaaaaccetcactcaaaaaggg 

1 MNVYLNQMGNVVRETSVSTVWKTLTQKG

12245 ctcgtttctaatcatcgaatatccgctgcccgagatgacaaggagtteccgcctaacgagtcgaggtggaaaaggcttccggat 

29 LVSNHRIFAVRDDKEFLSNESRWKRLPD

12329 gtcagatatgggacacttgctttgatggttactaaaattgacaagcgaagcaagttgctaaaggcccttcccgacaattgtgtt 

57 VRYGTLVLMVTKIDKRSKLLKAFPDNCV

12413 gagtttgagaaaatgactgacgcgcagtcgaaaaggcaccctgtgtctaaatactcgactattgatagcgacacgattgacatg 

85 EFEKMTDAQLKRHFVSKYSTIDSDMIDM

12497 gttatccagttctgtctaaacgattactctagaattgacaatgaattggacaagctgtcgcgattgaaaaaggttgacgcatca

113 VIQFCLNDYSRIDNELDKLSRLKKVDAS

12581 gtagttgaatccactgccaagcacaagaccgaaactgacatttccagcctagttgatgatgtattggaatataggccggagcag

141 VVESIVKHKTEIDIFSLVDDVLEYRPEQ

12665 gcaattatgaaagtgactgaactcttagccaaaggagaaagccctattggattgcttaccttgctttatcaaaattttaataac 

169 AIMKVTELLAKGESPIGLLTLLYQNFNN

12749 gcttgtcttgtgctaggagccgatgagcctaaagaagccaatctaggcattaagcagttcttaatcaataagactgtctataae 

197 ACLVLGADEPKEANLGIKQFLINKIVYN

12833 tttcaatacgagccggactcagcctttgaaggcatggctattttaggtcaagctaccgagggcataaagaatggtcgctacaca 

225 FQYELDSAFEGMAILGQAIEGIKNGRYT

12917 gaaagttcagtggtctatatttctctgtataaaattttttcacttacttaa 12967 

253 ESSVVYISLYKIFSLT*

dplORF020

1864 aeggctaatcaatacaaccagcctgaaagaggcaagattcgaaccaacgttcgcgaccctgagaaaatgcctatcatggaaatc 

1 MVNQYNQPERGKIRINVRDPEKMPIMEI

1948 ttcggtcctacaattcaaggtgaaggaatggttacaggtcaaaagaccattttcattcgaactggtggatgcgactatcatcgc

29 FGPTIQGEGMVIGQKTIFIRTGGCDYHC

2032 aactggtgtgactcagcccttacccggaacggtactaccgagccggaatataccacaggcaaagaagctgctagtcgaatcttg 

57 NWCDSAPTWNGTTEPEYITGKEAASRIL

2116 aaactagctttcaatgataaaggtgaacagatttgtaaccacgtgacattgactggaggaaatcccgccttaatcaacgagcct 

85 KLAFNDKGEQICNHVTLTGGNPALINEP

2200 atggccaagacgacttcgattctaaaagaacacggattcaagtctggtctcgaaactcaaggaactcgattccaagaatggttc 

113 MAKMISILKEHGFKFGLETQGTRFQEWF

2284 aaagaagtaagcgatatcactattagtcctaaaccgccttcaagtggaacgagaactaatacgaaaattcttgaagccattgca 

141 KEVSDITISPKPPSSGMRTNMKILEAIV

2368 gatagaatgaatgatgaaaaccttgactggtcatttaaaatcgttacctttgacgaaaatgacctagcttatgcgcgtgatacg 

169 DRMNDENLDWSFKIVIFDENDLAYARDM

2452 tttaaaactttcgaaggcaagttacgcccagtgaactacctctcagttgggaatgcaaacgcacacgaagaaggaaaaatcagt

197 FKTFEGKLRPVNYLSVGNANAYEEGKIS

2536 gataggcttcttgaaaagttgggatggccttgggataaagtgtatgaagacccagctttcaacaatgttcgacctttaccgcaa 

225 DRLLEKLGWLWDKVYEDPAFNNVRPLPQ

2620 cttcatacacttgcttacgataataaaagaggagcacaa 2658 

253 LHTLVYDNKRGV* 
dplORF021

2504 atgcaaacgcatacgaagaaggaaaaatcagtgataggcttcttgaaaagttgggatggccttgggataaagtgtatgaagacc 

1 MQTHTKKEKSVIGFLKSWDGFGIKCMKT

2588 cagctttcaacaacgttcgaccttcaccgcaacctcacacacctgtccatgataataaaagaggagtataaaatgaaaattgag 

29 QLSTMFDLYRNFIHLFMIIKEEYKMKIE

2672 catctagataaaatcggtaacgcattagggagagagaacggatgggcttcccttaagccggacgaaattgtaacctcggacaat 

57 HLDKIGNVLGRENGWASLKPDEIVTLDN

2756 actgaggcagccgttcaaagactttttggtctattaggcgaggacgcagaacgtgacgggtcgcaagatactccattccgtttt 

85 TEAAVQRLFGLLGEDAERDGLQDTPFRF
2840 gttaaagcaetcgctgaacataccgcagggtatcgagaagaccctaaacttcatctcgaaaaaacattcgacgtcgaccacgaa
113 VKALAEHTVGYRED PKLHLEKTFDVDHE 
2924 gaccttgttcttgcgaaagacattccattcaattctctatgtgagcatcatttagctccgttcgcagggaaggtgcatattgca 
141 DLVLVKDI PFNSLCEHHLAPFVGKVHIA 

3008 cacattcctaaggataagattacaggtctttcaaaattcggecgagcggttgaaggatacgctaaacgacttcaagtacaagag
169 YI PKDKITGLS KFGRVVEGYAKRLQVQE 
3092 cgcccgactcaacaaaccgctgacgctattcaggaagttctaaatccccaagcagccgcggtcatcgtagaggctgagcatact 
197 RLTQQIADAIQEVLNPQAVAVI VEAEHT 
3176 cgcacgagcggacgcggcattaagaagcacggggcaacgacagtgacctcaaccatgcgaggtcttttccaagatgacgcatct 
225 CMSGRGIKKHGATTVTSTMRGLFQDDAS 
3260 gctcgagcagaattgcttcagttgactaaaaagtag 3295 

253 ARAELLQLIKK* 
dplORF022 

30896 atgagtaaagacactcttcacggaaccaagctcgtgcaaatcgaggagcttgacccattgactcagttgccaaaagtcggcgga

1 MSKDILYGIKLVQIEELDPLTQLPKVGG

30980 gctaaccttgtcgtagatacggcagaaacagcagaactcgaagccgcgaccccggagggaactgaagatgtgaaacgcaatgac 

29 ANFVVDTAETAELEAVTSEGTEDVKRND 

31064 acgcgcactctegctatcgtgcgtactccagaccttttatacggttatgacttaacaetcaaggacaacacgtttgaccctgaa 

57 TRILAIVRTPDLLYGYDLTFKDNTFDPE 

31148 atcatggccctaattgaaggtggeacagtacgtcaacaaggcggaactattgctggatacgacaccccaaegcttgcacaaggt 

85 IMALIEGGTVRQQGGTIAGYDTPMLAQG

31232 gcttctaacatgaaaccatttagaacgaacacctatgtgccaaactatgtaggtgactcaattgccaactacgcgaaaatcact 

113 ASNMKPFRMNIYVPNYVGDSIVNYVKIT 

31316 ttgaacaactgcaccggtaaagctccagggctttcaatcgggaaagagttctacgctcctgagttcaacatcaaggcacgtgaa 

141 LNNCTGKAPGLSIGKEFYAPEFNIKARE

31400 gcaaccaaagcaggtttgccagtcaagtcaatggactatgtggcacaacttccagcggttcttcgtcgcgtgacattcgattcg 

169 ATKAGLPVKSMDYVAQLPAVLRRVTFDL 

31484 aacggcggaacaggaaccgccgacgcagttcgagccgaagcaggtaagaagatttctccaaaaccagctgaccctaccttaaca 

197 NGGTGTADAVRVEAGKKISPKPVDPTLT

31568 ggtaaggctttcaaaggctggaaagttgaaggagaatcaactatttgggacctcgacaaccacatgatgcctgaccgagacgcc 

225 GKAFKGWKVEGESTIWDFDNHMMPDRDV

31652 aaactcgtagcacaatctgcatag 31675 

253 KLVAQFA* 
dplORF023

6419 atggccaagcccaatttaactagaattgcaaagatggttagagcaggaaacagtgaaggtcctgcttcatcttttgtcaattcg 

1 MAKSNLTRIAKMVRAGNSEGPASSFVNS

6503 ctgacccgggttattgaacgaactcagcctgaatataacccttcgacatattataagcccagcggggttggtggatgtattcga 

29 LTRVIERTQPEYNPSTYYKPSGVGGCIR

6587 aaaatgtacttcgaaagaatcggcgagtctactatagacaacgcagattctaacctaattgcaatgggcgaagctggaacacrt

57 KMYFERIGESIIDNADSNLIAMGEAGTF

6671 aggcacgaagttctccaagagtacatggttaaaatggctgaaatcgatgaggactctgaatggttgaatgtagcagagttcttg 

85 RHEVLQEYMVKMAEIDEDPEWLNVAEFL

6755 aaagaaaatccagttgaaggaaccaccgtcgacgagcgtttcaagaaaaacgattatgaaacgaagtgtaagaacgaacttctt 

113 KENPVEGTIVDERFKKNDYETKCKNELL

6839 caactttcattcctgtgtgacggactagttcgatataaaggcaagctccacattttagagattaagactgaaaccatgttcaag 

141 QLSFLCDGLVRYKGKLYILEIKTETMFK

6923 ttcactaaacatactgagccctacgaagaacacaagatgcaagcaacttgctacggaatgtgtctaggagtcgatgacgtcatt 

169 FTKHTEPYEEHKMQATCYGMCLGVDDVI

7007 ttcctttacgaaaatcgagataactccgaaaagaaagcctacacgtttcacaccacagacgagatgaaaaatcaagtccctgga 

197 FLYENRDNFEKKAYTFHITDEMKNQVLG

7091 aaaattacgacctgcgaagagtacgtagagaaaggcgaaagccccaaaatctatcgctctccagcctattgcccatattgtaga 

225 KIMTCEEYVEKGESPKIYCSSAYCPYCR

7175 aaggaaggtcgaaatctgtga 7195

253 K E G R N L *
dplORF024

25992 atgaacgcagtagatggecaggtagttcatattctacaagtattagcagaagatggaaatgctacggctgaaaagttcgaaaag

1 MNAVDGQVVHILQVLAEDGNATAEKFEK 

26076 gaagtcagggccgcacctttagcatttccacgaagagcagccgaggcagctgccaaaggtgaaatctataaggacggcaaaaac 

29 EVRAASLVFSRRAAEAVVKGEIYKDGKN

26160 ctctcgaaacgtgtttggtcttcagccgcacgcgcaggaaatgatgttcaacaaatagtcacacaaggcctagcaagtggaacg 

57 LSKRVWSSAARAGNDVQQIVTQGLASGM 

26244 tctgctacagatatggctaaaatgctcgagaaatatatcgaccctaaggttcgaaaagattgggactttgataagacagctgag 

85 SATDMAKMLEKYIDPKVRKDWDFDKIAE 

26328 aagctagggaaacctgctgctcacaaatatcaaaatctcgaatacaatgcccttcgacttgctcgaactaccactagccactcc 

113 KLGKPAAHKYQNLEYNALRLARTTISHS 

26412 gccacagctggagcgagacaatggggcaaggctaatccttatgcccgaaaagttcaatggcattcegttcacgctccaggccga 

141 ATAGVRQWGKVNPYARKVQWHSVHAPGR 

26496 acgcgccaagcgtgtaccgattcagatggtgaagcatttcctatcgaagaatgtccctccgaccaccctaatggaatgcgccac 

169 TCQACIDLDGEVFPIEECPPDHPNGMCY

26580 caaactgcatggcacgaaaactcacccgaagaaaccgccgacgagctgagaggctgggtagacggagaacctaatgacgtacta 

197 QTVWYENSLEEIADELRGWVDGEPNDVL

26664 gacgaatggtacgacgatttaagrccaggaaaagttgagaaacacagcgacctcgactttgttaaaagttattag 26738 

225 DEWYDDLSSGKVEKYSDLDFVKSY* 
dplORF025

18778 atggcaaagaacaaaaagcgaaaaaaagtaaatgtcaaaaggaaaatgctcatccctacaaatctctcgaaaaaagcaaatgca 

1 MAKNKKRKKVNVKRKMLIPTNLSKKVNV
18694 aaagcaatcgcttacagaaaagccactgtcaagtggccgcctaacacagatgaaactcaagcacatttcgaccttcatataaac

29 KAIAYRKVTVKWLPNTDEIQVYFDLYIN

18610 aaaaacaggctgacaatgttaggcaccattgacccggacaagagctattttgaaggaattaggattgtttgtaagaaacctcag 

57 KNRLTMLGTIDPDKSYFEGIRIVCKKPQ 

18526 cctcggatgactgccaaggagctccaggttgcgcgcgcagacgccccaggtctctctgcagttcctaaagcctattgtcacacg

85 PWMTVKELQVARADAPGPPAVLKAYCHT

18442 gteggcgatgtactagatagcggagcagagcctactgaaattgttcaaggtattatgtataaagacggtgaactatttaaggac

113 VGDVLDSGAEPTEIVQGIMYKDGELFKD

18358 agtgaaattgccagccttttcaaatacgatgccaaagagccttatgagtttccaaaggaccttcctataaccttggacaacttt

141 SEIVSLFKYDVKEPYEFPKDLPITLDNF

18274 tcagagtccactatgtctagccagcatactagagcacttgttttgcgttgtgctaatacaggtgagttttccaagaattggcgg 

169 LEFIMSSQHTRALVLRCANIGEFSKNWR

18190 aaatggcaaaaagccatccagctcctgctcgactatgccaaggcggacgaccctaaagtagacgaaactgtttgggacttttca 

197 KWQKAIQLLLDYAKADDFKVDETVWDFS

18106 cccggctccaaagccggaaaggtagcacgtcgtaaaggctatgaggcaattcaacaagccctcgagcagataaataaataa 18026

225 PGSKAGKVARRKGYEAIQQALEQINK*
dplORF026

21512 atggcgaaagctactggaccaaaagttcgaagaggaaaaactcctccacggccaaaagacaaaaaaggaatcaaagcaaatgcg 

1 MAKATGPKVRRGKTPPRPKDKKGIKANA 

21596 cgcgccaataaagaccagttcgtagagcatgactacaaaggcatcaagatgacaatcaaggaacgtgatgctagaatgaaattg 

29 RVNKDQFVEYDYKGIKMTIKERDARMKL 

21680 gaatttattagaggcatgactattcaggaaattgcagcccgctatggattaaatgaaaagcgtgttggcgaaatacgggctcgc 

57 EFIRGMTIQEIAARYGLNEKRVGEIRAR

21764 gataaatgggtgaaggctaagaaagagttcgagaatgaaaaggctcttgttactaatgatacattgactcaaatgtatgcaggg

85 DKWVKAKKEFENEKALVTNDTLTQMYAG

21848 tttaaagcctcagtcaatattaaatatcacgccgcccgggagaaactaatgaacatcgtcgaaacgcgtttagataatcccgac 

113 FKVSVNIKYHAAWEKLMNIVEMCLDNPD

21932 agatatttatttactaaagaaggaaatattagatggggcgcattagatgtcctttcgaaccttatagatagagctcaaaaagga 

141 RYLFTKEGNIRWGALDVLSNLIDRAQKG

22016 caagaaagagcgaatggaatgcttccggaagaggttcgatatagactacaaattgagcgcgagaaaattacattgctccgggcc

169 QERANGMLPEEVRYRLQIEREKITLLRA

22100 aaaatgggcgaccaggaaattgaaggcgaggttaaagataacttcgtagaagcactagataaagcagctcaagccgtttggcaa 

197 KMGDQEIEGEVKDNFVEALDKAAQAVWQ

22184 gaatttagtgacgcaacaggttcctacattaaaggagtgactgataatgacaataagcctgagaaataa 22252

225 EFSDATGSYIKGVTDNDNKPEK*
dplORF027 

52762 atgggaaaagtatcaattcaaaaatcaggaacatttagctcagggtctaataacgagtttttcacactcgctgaccacggtgac 

1 MGKVSIQKSGTFSSGSNNEFFTLADHGD 

52846 agcgcaattgtcactctattgtatgatgacccggaaggcgaagacatggattatttcgtagtccaegaagcagacgttgacggt 

29 SAIVTLLYDDPEGEDMDYFVVHEADVDG

52930 cgtcgacgctatatcaattgcaatgctattggcgaagacggggaaacagtccatcctgataattgtccattatgccaaaacgga 

57 RRRYINCNAIGEDGETVHPDNCPLCQNG

53014 ttccctcgtattgaaaaactatttcctcaactttacaaccatgatacgggaaaagttgaaacatgggaccgaggccgttcttat 

85 FPRIEKLFLQLYNHDTGKVETWDRGRSY

53098 gttcaaaagattgttacatttatcaataaatatggaagccttgtgactcagccttttgaaattattcgttcaggagctaaaggt

113 VQKIVTFINKYGSLVTQPFEIIRSGAKG

53182 gaccaacgaactacttatgaattccttccagagcgtccggaagacagtgctactcttgaagattttccagaaaagagcgaactt

141 DQRTTYEFLPERPEDSATLEDFPEKSEL

53266 cttggaactctaattttagacctcgacgaagaccaaatgtttgacgtggttgacggcaagttcactcttcaagaagagcgttct 

169 LGTLILDLDEDQMFDVVDGKFTLQEERS 

53350 tcaagtcgttcaaattcacgtagaggagcatctcctgcgcctagacgaggttccggtcgagaatcttcacaaggtcgaacagct 

197 SSRSNSRRGASPAPRRGSGRESSQGRTA 

53434 gaaagaactccttcagttagtcgaagaactcctccaacacgaggtcgaggattctaa 53490 

225 ERTPSVSRRTPPTRGRGF*
dplORF028

44595 atgtcaaaaattaaattcgaaaaccttaaaaaaggcgatgttgtgctacgagctaaatctcaaacgaagtttaaaatcgtttca 

1 MSKIKFENLKKGDVVLRAKSQTKFKIVS

44679 attttagcagacgaaaagaaagcagaccttgaatcattagaagacggaggtgaacttcacctttcagcttcaactctcgaacgt 

29 ILADEKKADLESLEDGGELHLSASTLER 

44763 cggtacacaatggaagatgaaactgaacctaaaaaagaagaagctgctaaacctgctaaaaaggctgctcctgcagttgctcga 

57 WYTMEDETEPKKEEAAKPAKKAAPAVAR 

44847 cctgctcgaaaaggtagagtcgttcccaaacctaaaaaagaagtccttgaggaagaaattcctgaagttaaggaacagccggaa

85 PARKGRVVPKPKKEVLEEEIPEVKEQPE

44931 gaagttggttcagttagtgagaaatctactgttcgaaaacccgctcctaaaaaagaaagcgtgatggcgattaetaaggctctt 

113 EVGSVSEKSTVRKPAPKKESVMAITKAL 

45015 gaaagtcgaattgttgaagcctttcctgcgtctactcgaatcgtcactcagtcttacatcgcctatcgctctaagaagaacttc

141 ESRIVEAFPASTRIVTQSYIAYRSKKNF

45099 gctactatcgaagaaactcgaaaaggtgtttctattggagttcgcgcaaaagggttgacagaagaccaaaagaaacWicttgca 

169 VTIEETRKGVSIGVRAKGLTEDQKKLLA

45183 tctattgctcctgcatcttacgaatgggcgattgacggaatttttaaactcgccaaggaagaagatattgacaccgcaacggaa 

197 SIAPASYEWAIDGIFKLVKEEDIDTAME

45267 ttgaccgaagcttctcacctttcctcgctatga 45299

225 LIEASHLSSL*
dplORF029
662 atgaaatcagtagttttatcatccggcggagtcgacecagccacccgtctagcaattgaagcegacaagtggggttceaaaaat

1 MKSVVLLSGGVDSATCLAIEVDKWGSKN

746 gttcacgctatagcactcaattacggacaaaagcatgaagcagaacttgaaaacgccgccaacgtcgcaatgttctacggagcc 

29 VHAIAFNYGQKHEAELENAANVAMFYGV 

830 aagctcaccattcttgaaattgactcgaaaacctactcaagccccagcccteccttattacaaggaaaaggcgaaaccccacat 

57 KFTILEIDSKIYSSSSSSLLQGKGEISH 

914 ggaaaatcttacgctgaaatcctagcagagaaggaagcagttgacacctatgttccacccagaaacggactaatgctctcacag 

85 GKSYAEILAEKEVVDTYVPFRNGLMLSQ 

998 gctgcggcttatgcttacccggtcggagctccttacgccgcacacggcgctcacgcagacgacgcggctggaggtgcccaccct 

113 AAAYAYSVGASYVVYGAHADDAAGGAYP 

1082 gattgcactcccgagttctataattcaacgccaaacgcaacggaacatggaactggaggcaaggtaacccttgtcgctccccca 

141 DCTPEFYNSMSMAMEYGTGGKVTLVAPL 

1166 cttactctaaccaaggcgcaagccgttaaatggggaattgattcagatgttccttacttcctgactcgcccacgccacgaaagc 

169 LTLTKAQVVKWGIDLDVPYFLTRSCYES 

1250 gacgctgaaagttgtggaacttgcgcaacctgtaccgaccgcaaaaaggcattcgaagaaaatggaatgactgaccctattcac 

197 DAESCGTCATCIDRKKAFEENGMTDPIH 

1334 cacaaggagaattga 1348 

225 Y K E N * 
dplORF030 

20088 atgaataacgaaaaaattatcgaaaaaactaaaaatcttattcaattagcaaatgacaacccgagtgacgaagaggggcaaact 

1 MNNEKIIEKIKNLIQLANDNPSDEEGQT

20004 gcccttcttatggctcaaaagttgacgctaaagaataatatcgcacttgctcaagttgaacaatttgatgaacctaaacagttc 

29 ALLMAQKLMLKNNIALAQVEQFDEPKQF

19920 gagacctcccaagctgttgggaaagaagcaggtcgaatactttggtgggaacgtgaacttggtcacattctcgcgactaacttc 

57 ETSQAVGKEAGRIFWWERELGHILATNF

19836 aggtgectttgcactaatcagcgtgatacgcgcttgaataaaagtcgaataattttcttcggcgaaaaacaagacgctgaacta 

85 RCFCINQRDMRLNKSRIIFFGEKQDAEL

19752 gtgtetaaaatatatgaggctgctctgctttatctccgttaccgtattgaccgacttcctactcgcgaaccctcctacaagaat 

113 VSKIYEAALLYLRYRIDRLPTREPSYKN

19668 tcacaccccaaaggctctttgtcagccttagccattcgatctaaaaagcaggtggaagaatatccacttatggtcctacctagc 

141 SYLKGFLSALAIRPKKQVEEYSLMVLPS

19584 gagcaaacaaaaaacgcgcttcaggacacatttcgaaatttaaagaaggaaggaattgacagacctcaacatgacttcaacctc 

169 EQTKNALQDTFRNLKKEGIDRPQHDFNL

19500 gaagcgcatattgaagggcggtttcatggcgagaacgcaaagatcatgcccgatgaaattttggaaggcggtaactaa 19423 

197 EAYIEGRFHGENAKIMPDEILEGGN*



dplORF031

26943 atggcttatcaattagaagactcgctaaaaggtctagatgaaccaactatcaaacaggtgaaggaaattatttcgaaaaetccg 

1 MAYQLRDLLKGUDBPTIKQVKE1ISKTS 
27027 aaagaactcgatgctaaaatcctcatcgacggcgacggtcaacattccgcacctcacgcaegtttcgacgaagctgttcaacag 
29 KELDAKIFIDGDGQKFVPHARFDEVVQQ 
27111 cgcgatgcagctaacggcccaattaattcctacaaagaacaagtcgcgacgcttcctaaacaggtcaaagataaeggtgacgcg 
57 RDAANOSINSYKEQVATLSKQVKDNGDA
27195 cagaccactatccaaaaccetcaagagcaactcgacaagcagecccaactegcaaaaggcgccgtgattacttcagctctccae 
85 QTTIQMLQEQLOKQSQLAKGAVITSALH 

27279 ccgctgattagtgactccactgccccagcagcagacacccccggatctatgaaccttgacaacattacggtcgaaagtgacgge
113 PLISDSIAPAADILGPMNLDNITVESDG 
27363 aaagttaaaggtcttgatgaagagcegaaagctgcccgcgagtcccgtaaacacccatteaaagaagtcgaagttcccgcagaa 
141 KVKGLDEELKAVRESRKYLFKEVBVPAE 

27447 caagaggctcaagccaagtcgccagccgggaccggaaatttaggaaaeccaggtcgtgtcggcggtggtgtccccgaaccccgt

169 QKAQAK SPAGTGNLGNPGRVGGGVPBPR 

27531 gaaaccggctcttctggtaagcaacttgctgcegcccaacaaacggcaggagcacaagaacaaccaccattctttaaataa 27611

197 BIGSFGKOLAAA<J<?TAGAOEOSSFFX* 
dplORF032

52033 atgaaagaagcgaacagactagtccccagceatgtaggactcgaacgctggactgacgaagaatgtatcaggaactctgaacta 

1 MKBANRLVSSYVGFECWTDBECIRKPEL 

52117 gaccctgatatgtcaacegcgcctgcctatcaccgttatcctgggatgctctattcctatgcaaaaaggtttaaatgctcatct 

29 DPDMS1ASAYKRYFGMLYSYAKRFKCLS 

52201 cgacaegacattgaaagcattgcacccgagaccacttcaaaacgtttggcaacgttcaaatcaaaccaaggggccaagcttcca

57 RKDIESIAFETISXCLATPKSMQGAKFS 

52285 acctaccccacaagacccttcaagaatagaatagccttagaatataggtacctaaacgcaccctccatgaaccgaaatcggtat

85 TYLTRLFKNRIVLEYRYLNAPSMMRNWY 

52369 gcagaagcgacgttcgatagcgcctcgacaaacgaagaaggcgacgatcttagtaccctatcgacagttggccattgcgaagac 

113 VEVTFDSVSTNEEGDDFSILSTVGYCBD 

52453 tacggaaaaatcgaaaccgaagcaagtcctgactccacgacgcttcccaatacagagtatgctcatatcccgtctgtcattcaa 

141 YGKIB18ASLDFMTLSNTBYAYISSVIQ 

52537 aacggtccttcagcaagcgacgcagaaattgcgcgtgaaaccggagtaagcaggcctgctattagtcagcctaagaagtcacta 

169 NGPSVSDAEIA REIGVS RSAISQSK_lfr-SL 

52621 aaaaaeaaattaaaagaetteatataa 52647

197 KNKLKDFI * 
dplORF033 

7670 acggcaagacctaagttacctcaaatcgatattcgagaagaagaaatacgagatgctcaagacgtagcagactcgtatggcgcg 

1 MARPKtPQIDIREEEIRDAQDVADSYGA 

7754 attatcaataaagtagtcgacgaaattgttgaagcagcttgcggttcacctgaccaggcaatggaagaaattcaaatagttgta 
29 IltfKVVDEIVEAACGSLDQAMEEIQIVV

7838 agccaaaatcctgtcattacggaagacctcaaccactacactggctatcttcccactcttccctatttcgccgcagatagggcg

57 SQNPVIMEDLNYYIGYLPTLLYFAADRA

7922 gaaatggtgggaacacaaacggattcaagtcctgctatcaggaaagaaaaacacgataatctatacactttagccgccgggaaa 

85 EMVG2QMDSSSAIRXEXYDNLYILAAGK 

8006 actattcctgacaagcaagcagaaacecgaaaacttgtcatgaatgaagaagtcatcgaaaatgcttacaagcgagcctacaag 

113 TIPDXQAETRXLVMNEEVIENAYKRAYK 

8090 aaagttcaattaaagctagaacaggccgaeaaggtattagcatctttaaaacgaattcaaacctggcaactagcagagttagaa 

141 KVQLKLEQADXVLASLKR I QTWQ LAELE 

8174 actcagecaaataattcaaaaggagtactactaaatgcaaaaagacgtagacgtgaaaatgattga 8239

169 TQSNWSKGVLLNAKRRRREND- 
dplORF034

131 acgagtcaaaacactacacgcactgacgctgaatcgacaggcgttacccttttaggaaaccaagacaccaaatacgactatgac 

1 MSQNTTRTDAELTGVTLLGNQDTKYDYD 

215 tataatccagacgcccttgaaactccccctaacaaacatcctgaaaacaattacctagtaacacccgacggacacgaactcact 

29 YNPDVLETFPNKHPENNYLVTFDGYEFT 

299 tccctttgccctaaaacaggacagcccgactccgcgaatgttctcattagttacantccaaacgaaaagatggctgaatctaaa 

57 SLCPKTGQPDFANVFISYIPNEKMVESX 

383 tcatcgaaattgcacttacccagcctccgtaaccacggtgacttccacgaagattgcatgaacattatcttgaatgacttgcat 

85 SLKLYLFSFRNHGDPHEDCMNI ILHDLY 

467 gaaccgatggaacctaagtacatcgaagecatgggcctattcactcctcgtggtggaatttcaacctacccattcgtcaacaaa 

113 ELMEPKYIEVMGLFTPRGGIS IYPPVNK 

551 gtgaatcctcaafcttgcaactcctgaacttgaacagcttcaacttcaacgcaaattgaacttccteggaaatgttcaaggtcet 

141 VNPQFATPELEQLQLQRKLNFLGNVQGL 

635 ggacgagctatttcgatag 652

169 G R A I R *
dplORF035

17425 atgcacctaatgaaggattcgaagatgttgaggacatggaagtccttagcattcgagttcgaaacgaaggtgaggacgacgagt 

1 MHLMKDSXMLRTWXSLAFEPETXVRTTS 

17341 gggttgaagttatcgcccgctacgaaaacgatgacgaggacgaagatttggaagggctataaaatgaaggtatttatcaacaat 

29 GLXLSPAMXTMTRTKIWXGYXMXVFINN 

17257 catactgaagctgatatcgactacaaagatattccaaattttgtagcttatcgaaactctcctaaccctcaaattcaaatcact 

57 HTEADIDYKDILNFVAYRNSPNPQIQIT 

17173 agccggaacgccttgctccccegctatacacggaatgagctttcttataaaggagtttcaacaacggacttttttgaagccatt 

85 SWNALIiSCYTRNELSYXGVS I TDFFEAI 

17089 caaactattgcaagttccttcactcacctagactcgaaaacaatcgataeacaaaatgaaaagegactcgaaaggattgaggaa
113 QTIASSFTHL.D5KTIDTQNEKRLERIEE 
17005 cttcagtcaagaataggtcattgtaactgtactatcgacgaacttaaaaaaggagtccacgaaatgccggatattgaatcagct 
141 LQSRIGHCNCTIDELKXGVHEMPDIESA 
16921 atttcttaccagtacggacagattcttgctcatgaagatgaacttaattttctgctaaacCaa 16859 

169 I SYOYGQI tiAYEDELNFLLM * 
dplORF036 

48808 gtgttagccgaacgaaaagccgacaaggaatgttgggaatggctagaagctgttcgagcaaacatagtcgaagaagttcgaaac

1 VLVBRKADKECWEWL8AVRANIVEBVRN 

48892 ggtcctagcactgtcattgcrtcgaatactgtcgggaatgggaaaactagctgggcggctcgacctttgcaacgctatttagca 

29 GLSIVIASWTV GMGKTSWAVRLLQRYLA 

48976 gaaactgcacttgacggaagaattgctgagaaaggaatgtttgtagtgteagctcaaccatcgactgagttcggcgactataat 

57 ETALDGRIVEKGMFVVSAQLLTBFGDYN 

49060 tatttccaaaccatgcaagaatttcecgaacgtctcgagcgccttaagacttgtgagctattagtcatagacgaaataggcgga 

85 YFQTMQEFLERFERLKTCBLUV1DE1GG 

49144 ggtcccccaaccaaggccccCtatccttatctgtatgactcggtcaattatagggttgacaacaacrtgtcgactatttatacg 

113 GSLTKASYPYLYDLVNYRVDNNLSTIYT 

49228 accaattatactgacgacgaaattattgaccctttaggccaaaggctttatagtcgtatatatgatacttcagtggttctagat 

141 TNYTDDEIIDLLGQRLYSRIYDTSVVLD 

49312 ccccaggcaagcaatgtaagaggattggaggtaagcgaaattgaatcatag 49362 

169 FQASMVRGLBVSE1ES* 
dplORF037

55855 atggtgaagaaactgaaatctaaaatctatccagctgcatatacaattctagtagttattgcgaaccttgtgacaatiteatttc

1 MVXKLKSK1YSVAYI1LVVIANLVTIYF 

55939 gaacccttaaacgcgaaaggaatcttaattcctccaagcagttggtttacgggattcactcccctgcttataaatctaataagc 

29 BPLNVKGILlPPSSWFMGFTFLlilNLIS 

56023 aagtacgagaagccaaaatttgcaggrtctttgacacgggcagggrcattcctcacctcgctgacttgcttcacgcaaaaccta 

57 KYEKPKPAGSLIWVGLFLTSLICFMQNL 

56107 ccacaatcgcttgccgcggctccaggagttgcattttggacaagtcaaaaagcaagtgtctctatattcgacaagccctcgaac 

85 PQSLVVASGVAFWISQKASVFIPDKLSN 

56191 aaattagactcgaagactgcaaatgctttgtccagcaacatcggttctattatagacgcaaccatatggattccattaggactg 

113 KLDSXIANALSSNIGSI IOATI HI SLGL 

56275 agtcctcctggaattggaacggtcgcacatatagatattccgtcagccgtactaggccaagctccagttcagtttatcttgcag 

141 SPLGIGTVAY1DIPSAVLG0VLVQ FIL_0 

56359 tcaattgcttcgagatattcgaaaaagcag 56388

169 SIASRYI.XK* 
dplORF038

1350 atgagagtttccaaaaccccaacattcgacgcagctcatcaactagctggacattttggaaaatgcgcaaattcgcacgggcat 

1 MRVSKTLTFDAAHQLVGHFGKCANLHGH 

1434 actcacaaagccgaaattccaccagcaggcggaacttatgaccacggttcgagtcaagggacggtcgttgactttcaccacgtc

29 TYKVEISLAGGTYDHGSSQGMVV DFYHV 
1518 aagaaaatcgcaggtacattcattgacagacttgaccacgctgttcttcttcaagggaatgaaccaatcgctttagcaaatgca

57 KKIAGTFIDRLDHAVLLQGNEPIALANA

1602 gttgacaccaagcgagttctatttggacttagaactacggctgagaacacgccaagattccccacctggactctcacggagctt 

85 VDTKRVLFGFRTTAENMSRFLTWTLTEL 

1686 atgtggaagcatgctcgcaccgaccccaccaaaccatgggaaactcccacaggttgcgcagaatgcacttaccacgagatttcc 

113 MWXHARIDSIKLWETPTGCAECTYYEIF 

1770 acagaagacgagatcgaaatgttcaagaacgtaacctttaccgacaaagacgaaaagattactgtccgcgaaattttagagcag 

141 TEDBIEMFKNVTFIDKDEKITVREILEQ 

1854 gagcaggataatggttaa 1871 

169 E Q D N G * 
dplORF039

3306 atgaataaaagcgcaaccttttggctcgtccgaacagctcttattgcggctctatatgtgacatcgaccgttgcattttctgct 

1 MNK5ATFWLVRTALIAALYVTLTVAFSA 

3390 attagttatggacccactcaatttagagtcagtgaagccctgattcttctacctttatggaaccacagatggaccccggggatt 

29 ISYGPIQFRVSEALILLPLWNKRWTPGI

3474 gcattaggaacaattatcgcaaacctcttttcacctcttggactgattgacgttctattcggtccacttgccaccttccttgga 

57 VLGTIIANFFSPLGLIDVLFGSLATFLG 

3558 gcagcggcaacggtgaaagttgctaagatggcaagtcctctatattcacttatccgtccagttcttgctaatgcttaccttatt 

85 VVAMVKVAKMASPLYSLICPVLANAYLI

3642 gcgctggaacttcgaacagtttaccccctacctttctgggaatctgtcatctatgtaggaatcagtgaagcgattatcgtttca 

113 ALELRIVYSLPFWESVIYVGISEAIIVL

3726 atttcatacttccttatttccacgctggcgaagaacaaccactttagaacactgataggagcgaaaaatgggatttaa 3803 

141 ISYFLISTLAKNNHFKTblGAKNGI * 
dplORF040 

7192 gtgagctatactggaaaaatgtccgaggaagactttttcgaaggtgcaaaagactttgagaaagatgctttcacggtccgtcta 

1 VSYTGKMFEEOFFEGAKDPBKDAFTVRL 

7276 tatgataccactaatggatttcgaggagctgcaaatccctgcgattatatagccgcaactaactctgggacctcgtttattgaa 

29 YDTTNGPRGVANPCDYIAATMFGTLFIB 

7360 ctgaaaactactaaagaagcttctttgagctttaataacatcactgataatcaatggttccagctatcacgcgcagatggatgc 

57 LKTTKEASLSFNNITDNQWFQtSRADGC 

7444 aaatttattctcgccggaattttagtgtatttccaaaagcatgaaaagattatatggtatccaatttcaagccttgaaaaaatt 

85 KFILAGILVYFQKHEKIIWYPISSLBKI 

7528 aaacggtctggagttaaaagcgtcaacccaaacttcatcgatgcagggtatgaagtttcttacaagaagcgtcgaactagattg 

113 KRSGVKSVNPNFIDAGYBVSYKKRRTRL 

7612 accatecctttccaaaatgttetagatgcagttgagcttcattacaaggagaaaagcaatggeaagacctaa 7683 

141 TIPPQNVLDAVELHYKEKSSGKT* 
dplORF041 

8208 atgcaaaaagacgtagacgtgaaaatgattgaccctaaaettgaccgattaaaatacacaggtgattgggttgatgtacgaatt 

1 MQKDVDVKM1DPKLDRLKYTGDWVDVR1 

8292 agttccaccactaaaattgacgccgacagcgccgatgtctcaagatgccgaaaagtgctccaaaaggctcaagtaCattcagtg 

29 SSITKIDADSADVSRCRKVLQKAQVYSV 

8376 gcggcaggtgaatgcattaaaattgcacacggatttgctcctgaacttcctaagggatatgaagcaatcttgcatcctcgttcc

57 AAGECIKIAHGFALBLPKGY8AII.HPRS 

8460 agcctctttaagaaaactggtctaatcttcgtttctagcggagtgattgacgaaggttacaaaggtgacactgatgaacggttc 

85 SLPK.KTGLI FVSSGVIDEGYKGDTDEWF 

8544 tcagtctggtatgccactcgtgacgcagatatcttccacgaccaaagaatcgcccaatttagaactcaggaaaagcaacctgct 

113 SVWYATRDADIFYDQRIAQFR IQBKQPA 

8628 atcaagttcaatttcgtagaatctttaggaaatgcggctcgtggaggccatggaagtacaggtgatctccaa 8699 

141 IKFHFVESLGNAARGGHGSTGDF* 
dplORF042

48082 gtggcaaggcaaagaataggcaaEtcaggaaagcetaaaaatgaaattgaactaacattcaaagacaagcctaaaactcgttct

1 VARQRIGNSGKPKNEIELTPKDKPKTRS 

48166 accttattcaagaaggacgtggcaacaggtctttcaaaagtcgagcatgattattttcaaatagttgaagcacttaacggaaaa 

29 TLPKKDVATGLSKVEHDYFQIVEALKGK 

48250 caactcgaacceaatatgaagcaggcgtcatctctctttatagttcagtatgaacttactttcaacattaagtgcatcgactat

57 QFEPNMKQVSSFFIVQYEFZ FNXKCXDY 

48334 aactggttcaacttttcgagcaccatgaaaaatgttcgaacttatttaaacactgagtcgaacattgaactttgtcgattttta 

85 NWFNFSSTMKMVRTYLN1ESNI BLCRFL 

48418 gctgaaagttttgttaaatatgaaaatgttcgaaaaagattgaacctaagcgaaaggttcataacggtctcgactttcaaaaga 

113 AESPVKYENVRKRLNLSERFITVSTFKR 

48502 gcctggattttggacgaactcgaaggaaaaacgggtccaaaattcgaaggattttattag 48561

141 AWILDBLEGKTGSKFBGFY* 

dplORF043

31699 atgactaatattatcacagctgagcagtttaagcaacttgcatttcaaatcatcgcacttccaggattttcaa&aggtagtgaa 

1 MTNI ITAEQFKQLAPQI IALPGFSKGSE 

31783 cctatccatgttaaaattcgagcagcaggtgtcatgaacctaatcgctaacgggaaaacccctaatacgcttttaggtaaagtg 

29 PIHVKIRAAGVMNLIAHGKIPNTLLGKV

31867 acagaactgtttggagaaacttcgacagtcactaaagacaatgctagtctagcatcaattactgaccaacagaagaaagaagcg 

57 TELPGETSTVTKDMASL AS I T D QT Q K_JC*~"E A

31951 ctcgaccgattgaacaaaaccgataccggtattcaagacatggctgaacttcttcgagtattcgcagaag«ttcaatggtagag 

85 LDRLNKTDTGIQDMAELLRVFAEASMVE

32035 cctacttacgctgaagtcggcgagtatatgacagatgagcaacttatgacaatcttcagtgcaatgtacggtgaagtgactcaa

113 PTYAEVGEYMTDEQLMTIFSA MYGEVTQ 

32119 gctgaaacctttcgtacagacgaaggaaatgtctaa 32154 

141 AETPRTDEGNV* 
dplORF044

25666 atggtaagtgtttegattagcagcagctcctttetgaagttcctgcttcattttagctcgacaagcatttctaaatcgaataag 

1 MVSVLISSSSFLKFLLHFSSTSISKSNK 

25582 gttttcaatttccttgtttcctacataagtggtgaaccgataatggcacttaggacattcgaagaatctccactctacgccctt

29 VFNFLVSYISGEPIMALRTFEESPLYAt. 

25498 ttcgatatgtttcgaaacaatctgtttagatgtaaggtcgaacttatgctcacaacggtcacaattaaccttgaacgtctgggt 

57 FDMFRNMLFRCKVELMLTMVT INLERLG 

25414 cgactccttcttcggctggttgctcagtttgttcttttcctttgtcatcaactccgtcttcttcactcgtttcatcttgaggcc 

85 RLLLRLVVQFVLFLCHQLRLLHSFHLEA 

25330 cctcttgtccgrttaactcgtttgctaatacaggcaacgccccagctgagattccgtcaagctgagcaagttcttccaaaatgc 

113 PLVRLI RLLIQAMLQLRFRQAEQVLPKC 

25246 gttcccattccttgtccgccttttecttcteactga 25211

141 VPIPCPPPPSY* 
dplORF045

25340 atgaaacgagtgaagaagacgaagttgatgacaaagaaaaagaacaaactgaacaaccaaccgaagaaggagtcgacccagacg 

1 MKRVKKTKLMTKKKNKLNNQPKKESTQT 

25424 ttcaaggttaattgtgaccattgtgagcataagttcgaccttacatctaaacagattatttcgaaacatatcgaaaagggcgta

29 FKVNCDHCEHKFDLTSKQ I I SKHZEKGV 

25508 gagtggagatecttcgaatgtcctaagtgccattaccggttcaccacttatgtaggaaacaaggaaattgaaaaccttattcga

57 EWRFFBCPKCHYRFTTYVGNKEISNLIR 

25592 tttagaaatacttgtcgagctaaaatgaagcaggaacttcaaaaaggagctgctgctaatcaaaacacttaccattcatatcga 

85 FRNTCRAKMKQELQKGAAANQMTYHSYR

25676 attcaggatgagcaagctgggcataaaatctcagggcttaeggcgaagctaaagaaggagacaaacattgaaaaacgagaaaaa 

113 IQDEQAGHKI SGLMAKLKKEINIEKREK. 

25760 gaatgggtatccatacag 25777 

141 B W V S I * 
dplORF046 

42774 atgccaatgtggctaaacgacacagcagtcttgacgacgattattacagcgtgcagcggagtgeetactgtcctactaaataag 

1 MPMWLHDTAVLTTI ITACSGVLTVLLNK 

42858 ttattcgaatggaaatcgaataaagccaagagcgttttagaggacatctccacaacccctagcactcttaaacagcaggtcgac 

29 LFBWKSNXAKSVLEDISTTLSTLKQQVO 

42942 gggactgaccaaacgacagtagcaatcaatcaccaaaatgacgccattcaagacggaactagaaaaacccaacgttaccgtctt 

57 GIDQTTVAINHQNDVIQDGTRKIQRYRL 

43026 tatcacgacttaaaaagggaagtgacaacaggccacacaactcccgaccattttagagagcectctattttattcgaaagttat 

85 YHDLKREVITGYTTLDHPREXiSIZjFBSY 

43110 aagaaccttggcggaaatggtgaagttgaagccttgcatgaaaaatacaagaaattaccaattagggaggaagatttagatgaa 

113 KMLGGMGEVEALYBKYKKLP I RBBDLDB 

43194 actatctaa 43202 

141 T I *
dplORF047

47542 atgaaatttgaagatgaaaaacagttcatcgctgcaatcgaagaagccggcgaaccaaatgccaccaaaggcgacatggagaaa 

1 MKFEDBKQFIAAIEEAGELMATKGDMBK 

47626 caagtcaaaagtcttcgtgatgctctaaaagagtacatgaaagaaaatgacattgaatctgctcaaggtaagcacttttctgct

29 QVKSLROALKEYHKENDX ESAQGKHFSA 

47710 accctctacacgacagagcgctcaactacggacgaagaacgcttgaaagaaactatcgaaaaatcagttgacgaagccgagacg 

57 TFYTTBRSTMDEERLKSIISKLVDBABT 

47794 gaagaaatgcgcgaaaaactttcagggcttatcg&atacaagcctgtcaccaatacgaaacctctcgaggatatgacttatcac 

85 EBMCEKLSGLIEYKPVINTKLLEDMIYS 

47878 ggcgagattgaccaagaagcaattcttccagcagttgtcacttctgttacagaaggcactcgttctggaaaggctaaaatttag 47961

113 GEIDQEAILPAVVrSVTEGIRFGKAKl* 
dplORF048 

16709 atggaaacaacactttatttcggttatcttacagcagattggaaagacggtcacaagaaccacactttccaceacgaaagcatt 

1 METTLYFGYLTADWKDOHKMYTFHYBSI 

16625 cctgcaaaagaaactgagaaacaatataaggtcaccggaaccaatcccaacttgcacttagacccaggctcagctattagaaag 

29 PVKBTBKQYKVTGINPNLYLDLGSVIRX 

16541 agcgaaettgacattgcagtattcaaagcatgtcetgtcgetgaaactggagtcacacttactcgcgacatggaagttgatgct 

57 SBLDIAVFKACPVAETGVT LTRDMBVDA 

16457 agaatcgaaatcatcaagaaattaactacaagaaccgaacgcctcaacgaaagaatcaaagcaagaaacgaacaaggtaaacaa 

85 RIEIIKKLTTRIERLNBRIKARNEQGKQ

16373 gaaagccgccacctagtatctgcgctagaagatcgcgctcgtcaaattgctggaatttatcaataa 16308 

113 BSRHLVSALBDCARQIAGIYQ*

dplORF049

44018 atgtttcaaccatttctcagcgagcatgtagccttggtcgccaaagtagaaccaagacttgttttcttcgatatactcgaactc 

1 MFQPPLSBHVALVVKVBPRLVPFDILBL

43934 atcttttggataagttcegtttgctcgagcgtaccagaaaccagtagcatctttctgccagccaagtttcttctcagccggttg 

29 IFWISSVCSSVPETSSIPLPAKFLLSRL 

43850 agcatttgcgttagtcaagcgatagacgtagtagtaaggttgacctgcacagtaccaacgctcatcgtggtcgttgacggaaat 

57 SICVSQAlDVVVRLTCIVPTLIVVV.Oq.N 

43766 tccgtcgtaggcgtagttgcagtgaacgatgttatcactgtcaatgaacatccctgtacgacctccagcgcccacgctagcacc 

85 SVVGVVAVHDVITVNEHPCMTSSA-CAST 

43682 tttgcgtccccagatgaagatgtcgcctcgtttagcaccccacggagcatettcactaatttag 43620

113 FASPDBDVASFSIPRSIPTN* 
dplORF050

15081 atgaacaatcagcgaaagcaaatgaacaaacgaatcgccgaacctcgcgaagaccatcaacgtgcaagaggtcgaataaacttc 

1 MNNQRKQMNKRIVELRBDYQRARGRINF 



WO 00/32825 PCT/IB99/02040

15165 cttcccgccgcaaaggaccacggcgaagaactcgaaaacctcgaagccctcgtgggacacactgacaatctagccgaatgttcc

29 LLAVKDHGEELENLEAFVGYIDNLVECF 

15249 cctgaaagccaacgaaatgccttgaggccacgcgtatcagatgaccttccagccactaatgcggccgccgaaactggataccac 

57 PESQRNVLRLCVLDDLPVTNAAAEIGYH

15333 tacacatgggctcaccaacttcgagacaaagcagccgaaacacttgaagaaattctagatggggataacatcactcgctctaaa 

85 YTWVHQLRDKAVETLEEILDGDNIIRSK 

15417 cacggaatcgaaactaaggagaaacttgatgaattatatggtaaaagtcattctagttag 15476

113 HGIEIKEKLDELYGKSHSS* 
dplORF051

29765 atgagttatgacgtgaattatgttaagaatcaagttcgtagagccattgaaaccgctcceactaaaatcaaggtacttcgaaac 

1 MSYDVNYVKNQVRRAIETAPTKIKVLRN 

29849 tcetgggtcagCgaeggaeatggaggaaagaaaaaggaeaaagegaatgaagtcgcagcagacgaccttgtttgtttagttgat 

29 SWVSDGYGGKKKDKANEVVADDLVCLVD 

29933 aactcaaccgtccctgaccttttagccaattctactgacgcgggaaaaacttttgcccaaaatggagtgaaaactttcattcta 

57 NSTVPDULAKSTDAGKIFAQMGVKIFIL 

30017 tatgacgaaggcaaaaecattcaacgagccgatactatcgaaattaaaaactcaggaagacggcacagggtagtagaaacccac 

85 YDBGKIIQRADT1BIKNSGRRYRVVBTH 

30101 aatcttctcgagcaagacattttgatagaacttaaattggaggegaacgaccaa 30154 

113 NLLEQDILIELKLEVND* 
dplORF052 

30516 atgactaaacgaacgacaacgatggacagatcgaaggaaatttcttcccacatttcagctctcgcctgctcctatgcttccagga 

1 MTKRTTMMDRLKEI LPTFQLS PAPMLPG 

30600 gctgaatctgacgagcaagacacagataggccggatgactacattgtccttcgatatagtcacagaacgcccagcgcaacaaat 

29 VEFDEQDTDRPDDYIVLRYSHRMPSATN 

30684 agcctaggaagttttgcttattggaaagttcaaatctacgtccattcaaactcaattattggtatcgacgaacatagcagaaag 

57 SLGSFAYWKVQIYVHSHS I IG I0EYSRK 

30768 gttcgaaacactatcaaggacatgggctacgaagtaacctatgcagaaactggtgactacctcgacacaatgctttctagatac 

85 VRNIIKDMGYEVTYAETGDYFDTMLSRY

30852 cgactagaaatcgaatatagaattccacaaggaggaaactaa 30893 

113 RLEIEYRIPQGGN* 
dplORF053

50300 atgctaacattcgaaagaatagtatctatacgagcaccaacttgcacttcactcacttccccgccatatagaaggacaccatgc 

1 MLTFBRIVSIRAPTCI SLISPLYRRTSC 

50216 ccgcccttccaagcagttgcaagcattttatcaatagtccacgactcacctcgcccaggccgagccattatgacaatcaaaecc 

29 PFFQAVASILSIVHDLPCPGRAIKTIKS 

50132 tcaccaggaagtaagcctccaagcacgtcgtccaacagttcaaaccccgtcgatattccaagcctttcaccgtcatggcttcta 

57 S PGSKPPSTSSNSSNPVDIPSLSPSWFL 

50048 atagtattcgcccagtctagtcgaagtttagcatttcgagcaatgtctagtccgcctacgaacttagagcgattgaaaagttct

85 1VFAQSSRSLAFRAMSSPPTNLBRLKSS 

49964 tccagttttggaattatattcgcaatcgcaatgttactatctacttga 49917

113 SSFGIIFAIAMLLST*
dplORF054

14423 atgtgtgaaaattgtcaaaacgaaacattcaatactagaattctcaatgaagatgaaagtggctatgtcgacgcctcactcact 

1 MCBNCQNETFHTRIFNBDBSGYVDASFT 

14507 tacaaggagactcgcgacaccgcagcagqtattagcaatcgagcggtagaaaagaaagaccgtgacagccttttagtcgctaca 

29 YKEIRDTAAAISNRAVEKKDRDSLLVAT 

14591 gttatggctcttcccgtttctcacgcagaagacttaggcaagagactttgtattgcaaactctcgattggaagcattccgtgaa 

57 VMALPVSHAEDLGKRLCIANSRLEAFRB 

14675 gctgctcaagaggccctcgagaatgaaaaggctgaagatttaaaggacgttatcttaggtctcatcgacgccgacaaaaaaatt 

85 AVQSALBKEKAEDLKDVILGLIDVDKKI

14759 ggcaaccttgcattgcaattagttgaatcaggagcattataa 14800 

113 GNLALQLVESGAL* 
dplORF055

27627 atgcctaatgtgcgagttaagaaaactgattttaatcaaaccactcgaagcattgtcgcaattcctgaccactacgttgctccg

1 MPNVRVKKTDFNQTTRS IVAI PDHYVAL 

27711 gctgctcaaatcccagctaccgcagcaactcaagtagggaacaagaaatacattcttgccggaacttgcgtgaaaaatgctact 

29 AAQIPATAATQVGMKKY I LAGTCVKNAT 

27795 acatttgaaggacgcaaaactggactcgaagtagtatctaccggtgaaeaatccgacggagtcatcttcgccgaccaagaagcg 

57 TFBGRKTGLEVVSTGEQPDGV1FADQEV 

27879 ctcgaaggtgaagaaaaagtaaccgtgacagtattagttcacggattcgtcaaatatgcagcccttcgaaaagttggcgatgct 

85 FSGSBKVTVTVLVHGFVKYAALRKVGDA 

27963 gtgcctgaatctaaaaacgcaatgattcttgtcgttaaacag 28004 

113 VPBSKNAMILVVK* 
dplORF056 

19151 atggaaaacaaatggaaagttatccattttcaaaactcatgtattaaacaagcagacgatgaaaaaaggaggctcctgttcgaa

1 MEMKWKVIHFQNSCIKQVDDEKRRLLPE 

19067 gttccaggaaccccttaccgtctacaagttcgggtgaaaatgagcttagttaaaattgaaacacgcgcaggaaacggctattat 

29 VPGTPYRLQVWVKMSLVKI ETRAGNGYY 

18983 aaaaggctagtatgccaagacgattttgtatttcatggtaaggagtcaatagatggttacttaattgacgccaccataactggc

57 KRLVCQDDFVFYGKESIDGYLIDATITG

18899 aaacccttggcggaataccgcgagcctacgaacaggcatattctcgaaaccactgcaccgcgagaagcagccgaactgaacaga 

85 KSLAEYCEPMNRHI LETIASREAAELNR 

18815 gctaaaaagcaagaccaacagaaacggagatactag 18780

113 AKKQDQQKWRY*
dplORF057 
9859 atgcaaaaatctctatttggacctaagccagtgcctgetagttcaaggcgcaagaaaagaacggctccaaaacctaaacctaaa

1 MQKSLFGPKLVPASSRRKKRTVPKPKPK 

9943 atcgacgagcaagtggtcgagcttatgaaccgcagagagcgccaagtgctegttcacagttgcatctattattattttaatgac 

29 IDEQVVELMNRRERQVLVHSCIYYYFND

10027 tcaattatagcagacgggcagtatgacaaatggagccacgaactatattctcttatagcttcgcaccetgatgagtttcgacag 

57 SIIADGQYOKWSHELYSLIVSKPDEFRQ

10111 actgttctctataacgagtttaaacagtctgacggaaacactggaacgggtctcccacacgaccgtcagttcgctgtaagggtc 

85 TVLYNBFKQFDGNTGMGLPYDCQFAVRV 

10195 gcagaaaggcttttaagaaaatga 10218

113 AERLLRK* 
dplORF058 

15633 atgacaccacgcgcatacaaaccaattcccacgcgcagagctagtgctaaacaagagaaggcagttgctaagcagttgggagga 

1 MTSRAYKPI PTRRASAKQEKAVAKQLGG 

15717 aaagcacagcctaattcaggagccactgactaccacaaaggcgacgtcgtaacagactcaatgcttatagaatgcaagacagte 

29 KVQPNSGATDYYKGDVVTDSMLI ECKTV 

15801 atgaagccacaaagttcagtcagcttgaaaaaggaatggtecccaaaaaatgaacaggaaaggtccgctcaaaaactcgactat 

57 MKPQSSVSLKKEWFLKNEQERFAQKLDY 

15885 tctgctatcgcctecgaccttggtgacggaggcgaacagtatacagcaatgcctataagtcagttcaagcgaatattagaggat

85 SAIAFDFGDGGEQYIAMSISQFKRILED 

15969 agaaatgataaccetatttaa 15989 

113 R M D N L I * 
dplORF059 

30154 atgcctcagcctgaattagtatggaagcctgaagaatttgttagcaactgtgaacggcatcgaaacaagcttcaagtcgctgtc 

1 MSQPELVWKPEEFVSNCBRYRNKFQVAV 

30238 acaacagtetgcgaagtcgccgccaccaagatggaagaatacgcaaagaegcatgctatttggacagaccgtacagggaatgct

29 ITVCEVAATKMEEYAKTHAIWTDRTGNA 

30322 cgacagaaactcaaaggagaagecgcttgggtaagcgcagaccaaaccacgatagctgtatcacatcacatggactacgggttt 

57 RQKLKGBAAWVSADQIKIAVSHHMDYGP 

30406 tggetagaaceagetcatggtcgaaaatacaaaactcccgaacaggccgtagaagacaaegtcgaagaactttttagagcgetg 

85 WLELAHGRKYKI LEQAVEDNVEELFRAL 

30490 agaaggttattagactag 30507 

113 R R L L D * 
dplORF060

38070 gcgatagctgtacctgctatccctactccgctctttccaggtacaccgtcgactccatcacgcccaggagctcccggtaaacct 

1 VIAVSAIPTPLFPGTPSTPSRPGAPGKP 

37986 gcgtcacctctaggaccctctagtcgaatccacgcaaagtcgtcaggaactaattcgctcggtttcttattagtattaaggaca 

29 ASPLGPSSRIHVKSSGTNSLG F L L V L R T 

37902 ccaatgtatttcccagattctgcactaaaattagtccccaaaatgtcatctgegtatccaacaacaacttgggacccatttaca 

57 PMYFPDSALKLVPXMSSAYLITTWDSPT 

37818 gtttcccctgaaaggactccttcgccgtcctcatttagcaagtccatcaagtcttttcgagggtcttggaaaatgatagtagag 

85 VSPERTPSPSSPSXSIKSFRGSHKMXVB 

37734 tttgaaaggtcgtcgtag 37717 

113 F B R S S *
dplORF061

19475 atggcgagaatgcaaagattatgcccgatgaaattttggaaggcggtaactaaaacgaaattcgaagtttaccctgcgcgacta

1 MARNQRLCPMKPWKAVTKMKPBVYSARL 

19391 tttgacgaagaggcgacatatgataggtatcgtgaagcactagagaaagt.tggaaatgtcgcttacttttgtgaaattgatact 

29 FDEBATYDRYREALEKVGNVAYPCEIDT 

19307 ggcaaccttgtaatcgaacrcgagccagacagcctagatgacctaatcgcgctttcaaatgtagtgggaactggactaaaatta 

57 GNLVIELELDSLDDLIALSNVVGTGLKL 

19223 tcacggccceatagagaagataagcctcttcaattatggattgttgacgggtacacggaataa 19161 

85 SRPYREDKPFQLWIVDGYME* 
dplORF062 

45284 gtgagaagcttcaaecaactccattgcggcgtcaatatctcctCccttgacgagtctaaaaatcccgtcaatcgcccactcgta 

1 VRSFNQFHCGVNIFFLOBFKNSVMRPFV 

45200 agatgcaggagcaatagatgcaagaagtttcttttggtcttctgtcaacccttctgcgcgaactccaatagaaacaccttttcg 

29 RCRSHRCK KPLLVFCQPFCANSNRNTPS 

45116 agtttcttcgatagtaacgaagttcttcttagagcgataggcgatgtaagactgagtgacgattcgagtagacgcaggaaaggc 

57 SFFDSNBVLLRAIGDVRLSDDSSRRRKG 

45032 ttcaacaattcgactttcaagagccttagtaatcgccatcacgctttcttttttaggagcaggttttcgaacagtagatttetc

85 FNNSTFKSIiSNRHHAFFFRSRFSHSRFIi 

44948 actaactga 44940 

113 T N * 
dplORF063

47200 atgaaattcactgaaggaaaaaactggtataaagtcggagagatatgtcaaatgttgaaccgctctctatccacgactaatgtt

1 MKFTB GKNWYKVGEICQMLNRSLSTINV 

47284 cggtatgaagcaaaagacttcgctgaagaaaataacattcacttccegtttgttcttcctgaacccagaacagaccttgaccat

29 WYEAKDFAESNNIHFPFVLPEPRTD LDH 

47368 cgtggttctcgattctgggatgacgaaggcgtgaacaaacccaaacgacttagggacaacccaatgcgcggtgactcggcactc 

57 RGSRFWDDEGVNKLKRFRDNLMRGDLAF

47452 tacactcgaactcttgtagggaaaactgaaagggaagcaattcaagaagatgctaaagcacccaaacgtgaacatggattggag

85 YTRT LVGKTBREAIOEDAKAFKRBHGLE 

47536 aattaa 47541 

113 N *
dplORF064 
29108 acggctacattgaaagctcctagcaccccaaccgtctccggagcagcagcgcattcagggcoggcatcctcccgceccgaagcg

1 KATLKALSTLIVSGAVVHSGSVFSCPEA 

29192 cttgcttcgtctttaattgaacgcaattttgcgtecgagattaaggcggccgaagatggagaaacggtagaaactgttectcaa 

29 LASSLIERNFAFEIKAAEDGETVETVPQ 

29276 acaattgaatcagttgaagaaategacgaagctgaacaaaegcgcgaagagcatgcggctaaaaccgttcccgagcccgctgaa 

57 TIESVEEIDEVEQMREEYAAKTVPELVE 

29360 ttagcaagagctaatggaattgacatttcttcaatttctcgaaaaagcgaatataccgacgctetaactaagcacgaaccagga 

85 LARANGIDISSISRKSEYIDALIKYELG 

29444 gagtaa 29449 

113 E * 
dplORF065 

51497 atgcagtttgccataacctacatcaaacacctcgatgagctcgtccgtcaatttccgttcatacatataaggacgaataaaccg 

1 MQFVITYrKKLDELVRQFPFIHIRMNKP 

51413 gtatctatcaagttcctctccaggaacgactttatgcccgacttcttcagtcctcccattccttcgaaacgcttcagggctgac 

29 VFIKFLFRNDFMLDFFSSPISSKRFRAD 

51329 gccttgcctaactacttcgctagacgttccaaaattccttttcagccactggtttccatagaaccctccatcgtttcgacctaa 51246

57 ALPNYFARCSKIPFQPLVSIEPSIVST* 
dplORF066 

28898 gtgaccaactgcgtcaggtggaagcaataccactttaccgtcgtcaatcaagttgaactgacgaatgttaccaacgtcaggaag

1 VTNCVRWKQYHFTVVWQVBLTNVTtfVRK 

28814 tttgtcagcgtcagcgaactgagcaattttcttagagtagacagegatttgaagacctgttttttcagcgatgaatttctcagc

29 FVSVSELSNFLRVDSDLKTCFFSDEFLS 

28730 gtcacttgcaagaagcaagaagttttcccaagaaccttgaacaceaattgcaagagctttcttgatagagtcactcttagtcat

S7 VTCKKQBVFPRTLNTNCKSFLDRVrLSH 

28646 ttggttataagtgtttcggttcaagaccattcgagtagggcgaacacctgtacgattttcgatgtcatccattgctgctaa 28566

85 LVISVSVQDHSSRANTCTIFDVIHCC*
dplORF067

45061 gtgacgactcgagtagacgcaggaaaggctccaacaattcgactttcaagagccctagtaatcgccatcacgctttctttttta 

1 VTIRVDAGKASTIRLSRALVIAITLSPL 

44977 ggageaggttttcgaacagtagattcctcactaactgaaccaacttcttccggctgttccttaacttcaggaatttcttcctca 

29 GAGrRTVDPSLTEPTSSGCShTSQiaSS 

44893 aggacttcttttctaggcttgggaacgaccccaccctttcgagcaggtcgagcaactgcaggagcagcccttttagcaggttta 

57 RTS F L Q L G TT.L PFRAGRATAGAAFLAGL 

44809 gcagccccttcttttttaggtccagttccatctcccattgtgtaccaacgctcgagagttgaagccgaaaggtga 44735

85 AASSFLGSVSSSIVYQRSRVBAER* 
dplORF068

29451 atggcagctcaaacggacattgaactagtcaaaaccaatatcgataacgataaetctccgtcaccaatgactgaccaaagtaec

1 MAAQTDIBLVKINIDNONSPSPMT DQSI 

29535 tcagctcctttagacaagcataaatctgtcgcccatgctagttacatgatttgcttaacgaagacccggaacgacgtggtaacc

29 SALLDKHKSVAYVSYMICLM1CTRNOVVT 

29619 cttggacctatcagtceaaaaggtgacgcagaccactggaaacaaatggcgcaactctattatgaccaatataagcaagaacag 

57 LGPISLKGDADYWKQMAQFYYDQYKQBQ 

29703 cttgaaactgatgaaaagtcgaacgctggttcgacaaCcteaatgaaaagggctgatgggacatga 29768 

85 LETDBKSNAGSTILMKRADGT* 
dplORF069

20411 atgaaactttatcacgccactg&ttttgataatcttggtaaaattctagctgaaggattgaagccttcagctggagttatttac 

1 MKLYHATDFDNLGKILAEGLXPSAGVIY 

20327 ctagcagaaagttatgaaaaggctctagcctttttatcgcttcgaaatgttgatactattgtcgttctcgaacttgaagtagat 

29 LAESYEKALAFLSLRNVDTIVVLSLBVD 

20243 attgaaaaatgtaccgaaagtttcgaccataacgaaaagatgttttgcagcctatttcatttcgacacttgtcgcgcctggact 

57 IBKCTESFDHN EKMFCSLFHFDTCRAHT 

20159 tatgacaagacaatcgaagcagacgacattgacttttcgaaagctcgaaaatatgatagaaagtga 20094 

85 YDXTIEVDDIDFSXARKYDRK* 
dplORF070

15973 atgataaccttatttaaaataaacagtgaaggaacagttactccaattaaagggtcagccatgcaactgtacgcagacettatt

1 MITLFKINSEGTVTPIKGSAMQLYADLI 

16057 cctatacaagaggacgatatacagttcgttgatataactggacttgaccctattgttcgagaaaacgtacttgagctcatttca 

29 PIQEDDIQFVDITGL-DPIVRENVLELIS 

16141 cggagccgtgtaggagtttcaaaatatggtacaaacctcgaccagaatgatgtcgacgatttcctacagcacgccaaagaagaa 

57 RSRVGVSXYGTNLDQNOVDDFLQHAXBE 

16225 gcgctcgactttgctaactacctaaccaagctacaaagtcaacaaaagcaaaacaaatag 16284

85 ALDFANYLTXLQSQQKQNK* 
dplORF071

38904 gtgaaacaggtcctagaggagttcaaggccttcaaggccctcaagggctccaaggaattcctggacctgcaggagccgacggac 

1 VKQVLEBPKVFKVLKGFKEFLDLQBLTD 

38988 getcgcaacatactcacctcgcttcctccaacagcccaaacggcgagggatttagtcatactgacagcggacgagcatacgtcg 

29 VSMZbT8L8LIVQTVRDLVXLTADX.'ir-TS 

39072 gtcagtatcaagaccccaatcccgcccattcaaaagaccctgcagcctacacatggacgaaatggaaggggaatgacggagctc 

57 VSIXISIPSIQKTLQPIHGRNGRGMTEL 

39156 aagggatacccgggaagccaggcgcagacggtaagactaatcatttccatatag 39209 

85 KGYPGSQAQTVRLIISI*
dplORF072 

51045 atgttcctccgtcttcaagttgtctcgaaagttcttcaattatttgttcaggagtcgcttcaatctgaagaccatttactttca
1 MFLRLQVVSKVFQLFVQESLQFEDHLLS

50961 tcaaaacgctccaaccccttcccttgtaacctcactccgaagacgagcagtcgacctagaggcctttgctttcaacggagagct 

29 SKCFNSFPCNLTSKTSSRPRGFCFQWRA 

50877 ttcgcctttttcagttccttcttcgccttcctctetgaatcctacaagagtataggctccagtttcaacgccccacatatattc 

57 FAFFSSFFAFLFESYKS IGSSFNVPHIF 

50793 gatgatttttcggtcttcgccacatcggtctttaacgacagatag 50749 

85 DDFSVFAISVFNDR* 
dplORF073

14262 gtgaacgcttgccggaagaatacgacgaagaaacttgggaacctatcactgaagcagaatacatcaagcgaacagaaaaaccta 

1 VNACRKNTTKKLGMLSLKQNTSSEQKHL 

14346 aagcagttgcaaaacccactcgaaaaactccagcgccctctcgtcgccctcgcccttaaaagaaaggttgaaataaaatgtgtg

29 KQLQMLLEKLQRLLVALALKRKVEIKCV 

14430 aaaatcgtcaaaacgaaacattcaacaccagaactttcaatgaagatgaaagcggctatgtcgacgccccattcacttacaagg

57 KIVKTKHSILEFSMKMKVAMSTPHSLTR

14514 agattcgcgacaccgcagcagctattagcaatcgagcggtag 14555

85 RFATPQQLLAIER* 
dplORF074 

32298 gtgacgaaaagaaaaatccaggattgcaaatgettatggagtgactatettcagtcgctcctctttttgtatatagaaaggaaa 

1 VTKRKIQDCKCLWSDYFQSLLFLYIBRK 

32382 ttacatggattttgggtcaattgcagcaaaaatgaccttggatatctcaaacttcacaagtcaattaaatcttgcecaaagtca

29 LHGFWVNCSKNDFGYLKbHXSIKSCSKS 

32466 agcgcaacggctcgcactagagtcctcgaagtcctctcaaattggctctgctttaacaggattagggaaaggacttacgactgc 

57 SATARTRVFEVLSNWFCFNRIRBRTYDC 

32550 ggccacccttccccttatgggatttgcagccgcccctattaa 32591

85 GYPSSYGICSRLY* 
dplORF075 

22447 atggcaaagttccgtccgttgaattccgtcatggcccaaagggaaaatgaaagagccatcgatactgtttttcctgaacgaatg 

1 MAKFCPLNSVMAQREMERAIDTVFPERM 

22363 gaaccgcctgctatgacgatatcgaaagttcgaaaaggcgagccctttgtccaccatgttaggagctggagttgtttcttacta 

29 EPSAMTISKVRKGEPFVHHVRSHSCFLL 

22279 aaagggacgaagttgaacttaggtagtccacttctcaggcttattgtcattatcagtcactcctttaatgtaggaacctgttgc 

57 KGTKLHLGSLFIjRLIVI ISHSFMVGTCC 

22195 gtcactaaattcttgccaaacggcttgagctgctctatctag 22154 

85 VTKFLPNGLSCFI* 
dplORF076

5728 gtgagagcattttcttcactcacgtctccgagcaagtggtcgaatgtagggtactcttcaccttctgtaacaatatcaatattg 

1 VRAFSSLTSSSKWSNVGYSSSSVTISIL 

5644 tactcaccattcccaataacttttagcgaagattcttcaggaactaatgtgacggttgcggccgtggtcttttctacaagtctt 

29 YSPFPITFSBDSSGTKVTVAAVVFSTSP 

5560 ccaaactgctccgctttcacaatcacgtcaatttcaacatcgctgtcgataatgcatcgaaggaagtttgagccatcatacgct 

57 PNCSAFTITSISTSLSIMHRRKPEPSYA 

5476 gtaaacatgacgcattcgccgtcaccaaaaatatgccaatag 5435

85 VNMTHSPSPKICQ* 
dplORF077 

14800 atggaacgaataaagacgctatttcacgtgatttatgctaacggcacccatttagaagtagcagctttgttcgacaccgttgat 

1 MBRIKTLFHVIYAKGTHLEVAALFDTVD 

14884 gattatgatgacgttatagaggacatccaggggtatattgatacccctgacctttacaaccaaaggagcattagaatggcgcct 

29 OYDDVIBDIQGYIDTPDLYNQRSIRMAP 

14968 tacaatcctgacatcaatggtgacgctattgctaccgacattttactacgactagatgatattatctacgtcgacgcaacttgt 

57 YNPDIMGDAIATDILLRLDD1 IYVDATC 

15052 gaaactattaaatacgaggagcctattgcatga 15084 

85 ETIKYEEPIA* 
dplORF078 

17507 atggcaacagtaaaggaaacagcaaaatttgacggacgtcttgtaaccatcttcgactacgacgatccagagtgggaaggatat 

1 MATVKBTVKFDGRLVTI FDYDDLBWBGY 

17423 gcacctaatgaaggattcgaagatgttgaggacatggaagtccttagcattcgagttcgaaacgaaggtgaggacgacgagtgg 

29 APMEGFEDVBDMEVLSIRVRMEGBDDEW 

17339 gtcgaagttatcgcctgccacgaaaacgacgacgaggacgaagatttggaagggrtataa 17280 

57 VSVIACYENDDEDEDLEGL * 
dplORF079 

35288 atggaactgataccattgataaaccctcgaacaaggttgacccctgcgcttaccatttgtccagcgaacccagtaaccttagaa 

1 MBLI PLINPRTRLiTPALTI CPANPVTLE 

35204 acaattgaagttcccatgctgccaactttagagacagctgaaccaatcattgacccaataccactaatgaagcttcgaatcagg 

29 TIEVPMLPILETAEPIIDPIPLMKFRIR 

35120 ttcgcacctcctgaaaccatctgtcccacaaagctagcaatcttgctaactaatgatgaaagcatgtttecagctgtcgataaa 

57 FAPPETICPTKLAILLTMOESMFPAVDK 

35036 agtgagccgagaagtgaagcaaeaccttga 35007 

85 SBPRSEAIP* 
dplORF080

42490 atgttgaacctcacaaaaccgcgccaaattgtggcagagttcactattggacaaggagctgaaaagaaacttgtcaaaacaacg 

1 MLNLTKSRQIVAEFTIGQGAEKK L-V K T T 

42574 actgtgaacattgatgcaaacgcagtaccaaccgcctctgaaactcttcacgacccagacttgtatgctgcgaaccgtcgagaa 

29 IVNIDAMAVSTVS ETLHDPDLYAANRRE 

42658 cttcgagctgacgagcaaaaacttcgcgaaactcgtcacgcaatcgaagatgaaattctagctgaacagccaaagactgaaaca 

57 LRADEQKLRETRYAI EDEI LAEQSKTET 

42742 gccctaacagctgaataa 42759 
85 A L T A E *

dplORF081

55466 atgttcaggaacagtatcgtccatctgtcggtctgcgccaaagttaaaggggtcgaaaccttcgttctcgccagcgccgatara 

1 MFRNSIVHLLVCVKVKGVE I FVLASVDI 

55382 cccgaacccgcattcaggaagactcacaccaggaagccctctccttcgaccggtagctgtttgaacacatcccaagtcccgcgc

29 LELVFRKTHrRKPSSSTGSCi-NISQVLR 

55298 ctgctgttgaacgaatatgatatagtctgccactttagggaaetcggtgaagaaatcttcaataaccttattcgcttctttgac 

57 LLLNEYDIVCHFRELGEEIFNNLIRFFD

55214 agatacattcatctgctcagcgactga 55188 

85 RYIHLLSD* 
dplORF082

44728 gtgaacttcacctttcagcttcaactctcgaacgttggtacacaatggaagatgaaactgaacctaaaaaagaagaagctgcta

1 VNFTFQLQLSNVGTQWKMKLNLKKKKLL 

44812 aacctgctaaaaaggctgcecctgcagetgctcgacccgetcgaaaaggtagagtcgtecccaaacccaaaaaagaageccttg 

29 NLLKRLLLQLLDLLEKVESFPNLKKKSL 

44896 aggaagaaattcctgaagttaaggaacagccggaagaagttggttcagttagtgagaaatctactgttcgaaaacctgctccta 

57 RKKFLKLRNSRKKLVQLVRHLLFENLLL

44980 aaaaagaaagcgtga 44994

85 KKKA*
dplORF083

35974 atgccttcagggtttttaaatcctgagtccttaaatcctgcgaaagtgagtcctacatattctagcacggttgcacctttgtcg 

1 MPSGFLNPESLNPAKVSPTYSSTVAPLS 

35890 acaaggtcaactccgtcgaccaatagcgtccgtctgctagccatetatttctccctcacggtgttacaatgttaccaaaccctg 

29 TRSIPSTNSVCLLAIYFSFTVLQCYQTL

35806 acagagttcctttactttctattacacaattcctcccgacagcctgtcaacgtcgtcaccgctccgaaccacgatcgccccaacgt 

57 IEFLYFYYTILSTVCQRRHCFELRLFQC

35722 tga 35720 

85 * 
dplORF084 

15445 atgaatcatatggtaaaagtcattctagctagtgtcttcgtactgtcagccttttgcatgacttgctcaatggtttacttggtt

1 MNYMVKVILVSVFVLSAFCMTCSMVYLV 

15529 acaggtaagcaagaggaccaccgtagtaccgtcgcccttgtatttggcgctctcgt&agctctgcggcgttctattcgacactc 

29 TGKQEDHRSTVALVFGALVSSAAFYSTL

15613 cttatcctcgcccatccgccatga 15636 

57 FILAYLP*
dplORF085

10847 gtgatgactataatcaaggactttttcgagccttgtgatactgtcacgcactcctccatttgcaagtttcccaataaacgaaag

1 VMTIIKDFFEPCDTVTHSSICKFPNKRK

10763 ggcgtcacgctcataactataaccagctccttcttcattttcactttcgataataaattgaagttgattaacgatgtcgtcatt 

29 GVTLITITSSFFIFTFDNKLKLINDVVI

10679 atcaattcgagtaaagtcaaaccgttgaactcgactgagaatagtgtcaggaatcttttgagggtcagtagtacatag 10602

57 IHSSKVKPLNSTENSVRNLLRVSST* 
dplORF086 

52760 atatgggaaaagtatcaattcaaaaatcaggaacatttagctcagggtctaataacgagtttttcacactcgctgaccacggtg

1 IWEKYQFKNQEHLAQGLITSPSHSLTTV

52844 acagcgcaattgtcactctattgtatgacgacccggaaggcgaagacatggattatttcgtag 52906

29 TAQLSLYCMMTRKAKTWIIS* 
dplORF087 

30036 atgattttgccttcatcatatagaatgaaaattttcactccattttgggcaaaaatttttcccgcgtcagtagaattggctaaa

1 MILPSSYRMKIFTPFWAKIFPASVELAK 

29952 aggccaggaacagttgaattatcaaccaaacaaacaaggtcgtctgctacgacttcattcgctttatcctttttctttcctcca

29 RSGTVELSTKQTRSSATTSFALSFFFPP 

29868 tatccatcactgacccaagagtttcgaagtaccttgattttagtaggagcggtttcaatggctctacgaacttga 29794 

57 YPSLTQEFRSTLILVGAVSMALRT*
dplORF088

5040 atgaaaaaagttcaaacttatcaagaatatctaaaactagttgagttcaaacgtcaactttctttaaatcttcgagaaggaaaa

1 MKKVQTYQEYLKLVEFKRQLSLNLREGK

5124 ataggagtcgatgaagcggttattcaattattcaccttecatagtttcaacaatatcgaggaacctcctttcattgtactcaaa 

29 IGVDEAVIQLFTFYSFNNIEEPPFIVLK

5208 atgcaagaggctgccgtgaacgggacttatgaagcaaaactcaatatgcttaaaagatttaaaattatttag 5279 

57 MQEAAVNGTYSAKLNMLKRPKII*
dplORF089

12495 atgtcaatcatgtcgctatcaatagtcgagtacttagacacaaaatgccttttcaactgcgcgtcagtcattttctcaaactca

1 MSIMSLSIVEYLDTKCLFNCASVIFSHS 

12411 acacaattatcaggaaaggcctttagcaacttgcttcgcttgtcaattttagtaaccatcaaaacaagtgtcccatatctaaca 

29 TQLSGKAFSNLLRLSILVTIKTSVPYLT

12327 tccggaagccttttccacctcgactcattagacagaaactccttatcatctcgaacagcgaatattcgatga 12256 

57 SGSLPHLDSLDRNSLSSRTANIR*

dplORF090

27037 atgctaaaactttcattgacggcgacggtcaacattttgtacctcacgcacgtttcgatgaagttgttcaacagcgcgatgcag 

1 MLKFSLTATVNILYLTHVSHKLFNSAMQ 

27121 ctaacggctcaattaatccttataaagaacaagtcgcgacgctttctaaacaggtcaaagataacggtgatgcgcagaccacta 

29 LTAQLILIKNKSRRPLNRSKITVMRRPL

27205 tccaaaaccttcaagagcaactcgacaagcagtctcaacttgcaaaaggcgctgtga 27261 

57 SKTPKSNSTSSLNLQKAL* 
dplORF091

43189 atgaaactatctaacgaacaatatgacgtagcaaagaacgtggtaaccgtagtcgttccagcagcgattgcactaattacaggt 

1 MKLSNEQYDVAKNVVTVVVPAAIALITG 

43273 cttggagcgttgtatcaatttgacactaccgctatcacaggaaccattgcacttcttgcaactttcgcaggtactgctctagga

29 LGALYQFDTTAITGTIALLATFAGTVLG 

43357 gtttctagccgaaactaccaaaaggaacaagaagctcaaaacaatgaggtggaataa 43413

57 VSSRNYQKEQEAQNNEVE* 
dplORF092 

46989 atgaaaactatctccatatnaaggaaagacactaaaaggaagccggacaggaacggaagaaaaactgcactcgaactagctcaa

1 MKTISILRKDTKRKPDRNGRKTALELAQ 

47073 gagattgatatgtcacctagtgagttagcagagctccttcaaattcctgaaaggacggcaaccagaattttaaaactcgacaaa 

29 EIDMSPSELAELLQIPERTATRILKLDK

47157 ctgctcaacaaagagcaatgctcaataatagaaaggtatataaatgaaattcactga 47213 

57 LLNKEQCSIIERYINEIH*
dplORF093 

45756 atgcaacatacgattaaacaatgtttgaaacttgccttcctgctaactgcaatatcaattgcctgtttagtcttccctaaacct

1 MQHTIKQCLKLAFLLTAISIACLVFPKP

45672 cgctcatcgcctaaaaggaaacatggacgctcttgtgcgtatccgaaacattcaacctggtgcgcgaatggagtagtctcgaac

29 CSSPKRKHGCSCAYSKHSTWCANGVVLH 

45588 gaaaactgctcattgcttgaagaagctattcggtttcgagagtcaatgtag 45538 

57 ENCSLLEEAIRFRESM*
dplORF094 

8281 atgtacgaattagttctatcactaaaattgacgccgacagcgccgatgtctcaagatgtcgaaaagtgcttcaaaaggctcaag 

1 MYELVLSLKLTPTAPMSQDVEKCFKRLK 

8365 tatattcagtggcggcaggtgaatgcattaaaattgcacacggatttgctcctgaacttcctaagggatatgaagcaatcttgc 

29 VIQWRQVNALKLHTDLLLNFLRDMKQSC 

8449 acectcgttccagtctttttaagaaaaccggtctaa 8484 

57 ILVPVFLRKLV* 
dplORF095

8877 gtgggaaaactacttcagctctcgacattgtcaagaatgcgcaaatggtatttgagcaggaatgggaacagaagactgaagaac

1 VGKLLQLSTLSRMRKWYLSRNGNRRLKN

8961 tcaaggaaaagctggaaaatgcgcgtgcatccaaagctagcaagaetgctgtcaaggaacttgaaatgcaactcgatagtcttc

29 SRKSWKMRVHPKLARLLSRNLKCNSIVF 

9045 aagagcctcctaagattgtatatcctgaccttgagaatacattag 9089 

57 KSLLRLYILTLRIH* 
dplORF096 

46681 gtgattcataaattcttcaacttcgttgaacttatctgcggtttctcctgttaccaggttgcatttgactgtcttcgaaagtat

1 VIHKPFNFVELICGFSCYQVAPDCLRKY 

46597 cttagcaagaggttcaataaccttttcccaattgctaaatatcacgcaggactttccttgctggatacattcctcgacaatttc 

29 LSKRFNHLFPIAKYHAGLSLLDTFLDNF

46513 gatacatctttcgaacttgcaagacttgacatcttgagtagttaa 46469

57 DTSFELARLDILSS*
dplORF097

39100 atggacgggattgaaatcttgatactgaccgacgtatgctcgtccgctgtcagtatgactaaatccctcaccgtttggactatt 

1 MDGIEILILTDVCSSAVSMTKSLTVWTI

39016 agagaaagcgaggtgagtacattgcgaacgtccgtcagctcctgcaggtccaggaattccttgaagcccttgaggaccttgaag 

29 RESEVSILRTSVSSCRSRHSLKPLRTLK

38932 accttgaactcctctaggacctgtttcacctatcttggaaactga 38868 

57 TLNSSRTCFTYLGN*
dplORF098

43627 gtgaaaatgctccgtgggatgctaaacgaggcgacatcttcatctggggacgcaaaggtgctagcgcaggcgctggaggtcata 

1 VKMLRGHLNEATSSSGDAKVLAQALEVI

43711 cagggatgttcattgacagtgataacatcattcactgcaactacgcctacgacggaatttccgtcaacgaccacgacgagcgtt

29 QGCSLTVITSFTATTPTTEFPSTTTMSV

43795 ggtactatgcaggtcaacctcactactacgtctatcgcttga 43836 

57 GTMQVNLTTTSIA*
dplORF099

38298 atgcaagttcgccatctgctactgaagctccagctggtggatggtctacgcaagttcctaccgtcccaggtggtcagtatttat

1 MQVRHLLLKLQLVDGLRKFLPSQVVSIY 

38382 ggactcgaacaagatggcgctacactgaccaaactgatgaaattggatattcagtttcaagaatgggcgagcagggtcctaaag 

29 GLEQDGATLTKLMKLDIQFQEHASRVLK

38466 gtgacgcaggtcgtgacggtattgcaggaaagaacggaatag 38507 

57 VTQVVTVLQERTE*

dplORF100

1597 atgcagttgacaccaagcgagctctatttggatttagaactacggetgagaatatgtcaagattccttacctggactctcacgg

1 MQLTPSEFYLDLELRLRICQDSLPGLSR

1681 agcttatgtggaagcatgctcgtatcgactctatcaaactatgggaaactcctacaggttgcgcagaatgtacttactacgaga

29 SLCGSMLVSTLSNYGKLLQVAQHVLTTR

1765 ttttcacagaagacgagattgaaatgttcaagaacgtaa 1803

57 FSQKTRLKCSRT*
dplORF101

19220 gtgataattttagtccagttcccactacatttgaaagcgcgattaggtcatctaggctgtctagctcgagttcgattacaaggt 

1 VII LVQFP LHLKARLGHLGCLARVRLQG 

19304 tgccagtatcaatttcacaaaagtaagcgacatttccaactttctctagtgcttcacgacacctatcatatgtcgcctcttcgt 

29 CQYQFHKSKRHFQLSLVLHDTYHMSPLR 
19388 caaatagtcgcgcagaataaacttcgaatttcattttag 19426

57 QIVAQNKLRISF* 
dplORF102 

4034 atgataacgtgggaatgtttgactgtatcgccgaactcgataaaattcctggcgtatctagacagcctaagacacgtgaacagc 
1 MITWECLTVSPNSIKFLVYLDSLRHVNS
4118 ttttggaagcaccacaaatttcttgggataactatctatacatgcgcgagcgaatggttgagaaagacaagcccccacctatcc 
29 FWKHHKFLGIIIYTCASEWLRKTSSYLF

4202 tccacatgggagaagactctaaatggctcaacttga 4237 

57 SIWEKTLNGST* 

dplORF103

49352 ttgaatcatagacacagtaacatcacaaccatttctctttggcagattgtctttctttgtacttgctgcgcggtgccctattgt

1 LNHRYSNITTIFLWQIVFLCICCAVSYC

49436 gcaggagtgcataatgagcgagagtctcaagataaggtgattcaaagttataagcagaaagaaaagtcagccgtctacttgaca 

29 AGVHNERESQDKVIQSYKQKEKSAVYLT 

49520 gccgatagttcaggagcttggctaggaagcgctccgggagccaaggaaagtcctccctacaatgaaaagggacagcatgcagga 

57 VDSSGAWLGSAPGAKESPLYNEKGQHVG

49604 aaattgaaagaggtgggagagtga 49627

85 KLKEVGE*
dplORF104

21427 atgagaaaaagagtgattttgaagccaaaaaggttgaactggtatgtccttaattcctaccctcgaatggttgagttttccgaa 

1 MRKRVILKLKRLNWYVLNSYSRMVEFPE 

21343 cttttgaacttttcgaatggttcgacctttcgaaggattgaggttttcgaaccggttgagttcctcgagcattctcgacttttc 

29 LLNFSNGSTFRRIEVFEPVEFFEHSRLF

21259 gacccctttctatgctcgacttttcgagtgttttga 21224 

57 DPFLCSTFRVF* 
dplORF105

2028 atgatagccgcatccaccagttcgaatgaaaatagtcttctgacctataaccattccctcaccttgaattgtaggaccgaaaat 

1 MIVASTSSNENSLLTYNHSFTLNCRTEN

1944 ttccatgataggcattctcccagggccgcgaacattgactcgaatcttgcctctttcaggctgattgtattgattaaccattat 

29 FHDRHFLRVANIDSNLASFRLIVLINHY 

1860 cctgctcctgctctaaaatttcgcggacagtaa 1828 

57 PAPALKFRGQ* 
dplORF106

10529 atgaacctcgtcaatgatgtaaactttgaactcgccgtccatagacttgtacctagaatcttcaataatgtttcgaacattccc

1 MNLVNDVNFELAVHRLVSRIPNNVSNIP 

10445 taccccattattagaagcagcatcaatttcaataggagagccaagtcctttgttcacatccttcgcgaaaactcgagcagtagt 

29 YPIIRSSINFNRRAKSFVHILRENSSSS

10361 ggttttaccagttccagcgccaccacagaatag 10329 

57 GFTSSSATTE* 
dplORF107 

10750 atgagcgtgacgccctttcgtttattgggaaacctgcaaatggaggaatgcgtgacagtatcacaaggcccgaaaaagtccttg

1 MSVTPPRLLGNLQMEECVTVSQGSKKSL

10834 attatagtcatcacgttgacatggaagccgtttctaatgcactag 10878 

29 IIVITLTWKPFLMH*
dplORF108

49447 atgcactcctgcacaataggacaccgcgcagcaaacacaaagaaagacaatctgccaaagaaaaatagttgtgatgttactaca 

1 MHSCTIGHRAANTKKDNLPKKNSCDVTI

49363 tctatgattcaatttcgctcacctccaatcctctcacattgcttgcctgaaaatctagaaccactgaagtatcatacatacgac 

29 SMIQPRLFPILLHCLPENLEPLKYHXYD

49279 tacaaagccttcggcctaaaaggtcaataa 49250

57 YKAFGLKGQ* 
dplORF109

31632 atgtggtcgtcgaagtcccaaatagttgattctccttcaactttccagcctttgaaagccttacctgttaaggtagggtcaacc 

1 MWLSKSQIVDSPSTFQPLKALPVKVGST 

31548 ggctttggagaaatcttcttacctgcttcaacccgaactgcgtcggcggttcctgttccaccgttcaaatcgaatgtcacgcga

29 GFGEIFLPASTRTASAVPVPPFKSNVTR

31464 cgaagaaccgccggaagttgtgccacatag 31435 

57 RRTAGSCAT* 
dplORF110

16444 atgattttcaactetagcatcaacttgcatgtcgcgagtaagtgtgactccagttccagcgacaggacatgctttgaatactgca 

1 MISILASTSMSRVSVTPVSATGHALNTA

16528 atgtcaagttcgctctttctaacaactgagcctaggtctaagtacaagttaggattgattccagtgaccttatattgtttctca 

29 MSSSLFLITEPRSKYKLGLIPVTLYCFS

16612 gtttcttttacaggaatgctttcacag 16638

57 VSFTGMLS*
dplORF111

28657 gtgactctatcaagaaagctctcgcaactggtgtccaaggttcttgggaaaacttcttgcttcttgcaagtgacgctgagaaac

1 VTLSRKLLQLVFKVLGKTSCFLQVTLRN 

28741 tcatcgctgaaaaaacaggtcttcaaaccgctgtctactctaagaaaattgctcagttcgccgacgctgacaaacttcctgacg 

29 SSLKKQVFKSLSTLRKLLSSLTLTNPLT

28825 ttggtaacattcgtcagttcaacetga 28851

57 LVTFVSST* 
dplORF112

32207 atgcaaactgatttaggcaaatactgcttcgacgcagcagccgttgcttatattagatatttgcaggaagacaagacccctagg 

1 MQTDLGKYCFDAAAVAYIRYLQEDKTPR 

32291 tatcccggcgacgaaaagaaaaacccaggaccgcaaatgcttatggagcga 32341 
29 YPGDEKKNPGLQMLME*
dplORF113

17715 atgaaaacagttaaagaagcaatcaaacaattcggtgatgaatggtggtacgaaattatcaacgaaaacggccaaatgattcaa 

1 MKTVKEAIKQFGDEWWYEIINENGQMIQ

17631 gacggaagaatcgaagacatgggcgaatacatggaagaaacggtcgaccaagttaagttcatcaactatggtgacatcgaatct 

29 DGRIEDMCEYMEETVDQVKFINYGDIES 

17547 caaateaccaaactatatatcgcataa 17521

57 QITKLYIA*
dplORF114

52952 atgctattggcgaagacggggaaacagtccatcctgataattgtccactatgccaaaacggattccctcgtactgaaaaaccac

1 MLLAKTGKQSILIIVHYAKTDSLVLKNY

53036 ttcctcaactctacaaccatgacacgggaaaagttgaaacatgggaccgaggccgctcttacgttcaaaagaccgttacactta

29 FFNFTTMIREKLKHGTEAVLMFKRLLHL

53120 ccaataaacatggaagccttgtga 53143 

57 SINMEAL*
dplORF115

5342 acgagcctcccctttttgacacatataatatacacgaattatcgcgagtttgtaaagccgttcccaaataattttaaatctttt 

1 MSLLPLIYIIYTNYREFVKPFLNNFKSF 

5258 aagcatattgagttttgcttcacaagtcccgttcacggcagcctcttgcattttgagtacaatgaaaggaggttcctcgatacc 

29 KHIEFCFISPVHGSLLHPEYNERRFLDI 

5174 gttgaaaccatagaaggtgaataa 5151 

57 VETIEGE*
dplORF116

20662 acgaaactttcaaaccttgctaaagcacttactaatgaatacctaatggcagtgaacaatgaccaagctgaagtcttaggcgca 

1 MKFSNFAKALTNEYLMVVNHDQAEVLGA

20578 ggaaatatcgaaaacattctcaacggttcgaactttgctaatgttgtagctgaagcgacagttttaaaactcgaaaaactcagc

29 GNISNILNGSNFANVVAEATVLKLEKLS

20494 gaagaggaagctaccgagtag 20474

57 EEEAIS*
dplORF117 

24680 atgataacaggctgctcgaacactttaaatcgaagtgaatctcgtaagtcactaatagttttgttcaagttatctgctactgtg 

1 MITGCSNILNRSESRKSLIVLFKLSATV

24596 ataaggtctttgacatcgctcgtcccgcatatgtcactagtcaatggttcattaagaataactcgacaaggaatttgcttcaag 

29 IRSLTSLVPYMSLVNGSLRITRQGICPK 

24512 ccggttggggcggattcctga 24492 

57 PVGADS*

dplORF118

15023 atgatattatctacgtcgacgcaacctgtgaaactactaaacacgaggagcctattgcatgaacaatcagcgaaagcaaatgaa

1 MILSTSTQLVKLLNTRSLLHEQSAKANE

15107 caaacgaatcgtcgaacttcgcgaagactatcaacgcgcaagaggtcgaataaacttccttcttgctgtaaaggaccacggcga 

29 QTNRRTSRRLSTCKRSNKLPSCCKGPRR

15191 agaactcgaaaaccctga 15208

57 RTRKP*
dplORF119

41054 atggaggttcaacacccccgattcagtacgccctaccttttcgggcatctcttcagcagacacgacttcagcggttcgacagat 

1 MEVQHPRFSTSYFFGHFPSRHDFSGSTD

41138 tttaacagggaacaacctcctccaaatcatgtcgaacattcaagccaacttcaacaacgcttccggcgctcacggatccactat

29 FNREQLPPNHVEHSSQLQQCFRRLRIHY

41222 ccaagcacttcacgctga 41239 

57 PSISR*
dplORF120

28387 gcgttgaagcgcaagcagaatacatgcgtatgcaatcgcctcaaeacggtaaattcaccgtcaaatcaaetaacagcgaggctc 

1 VLKRKQNTCVCNCFNTVNSLSNQLTARL 

28471 aatacacttacgactacaaeatggatgctaagcaaeaataegeagtcactaagaaatggaeeaacccagctgaaagtgacceta 

29 HTLTTTTMMLSHNMQSLRHGLTQLKVTL

28555 ccgccgacattttag 28569 

57 SLTF*
dplORF121

39222 gtgcagacggatcacgtgagttcagtttggaagataataaccaacaatatatgggtcattactccgattatgagcaagcagata 

1 VQTDHVSSVWKIIINNIWVITPIMSKQI

39306 geagggatcgaaccaagtategatggtctgacegcctcgceaatgetcaagtgggaggtcgaaacgagttccetaattecteat

29 AGIELSIDGLTALPMFKWEVETSSLILV 

39390 ttgaatttggtttaa 39404 

57 LNLV*
dplORF122

40402 atgttattctccttatcctacataccgaatcacgttcatgtctggattaaacgagcattgttccgttctaaatcggccgacttg 

1 MLFSLSYIPNKVKVWIKRVLFRSKSADL

40318 aacggattgggtaaagacccegtcatcgatgtgaacgaaccctcgcgcaaggcacataactccaccccctgcggagaacataga 

29 NGLGKDPVIDVNEPLRKVHNFIPCGEHR

40234 aattcggtcacttga 40220

57 NSVT*



dplORF123 

21327 atggctcgacttctcgaaggaccgaggttctcgaaccggctgagtcttccgagcatcctcgaccccccgacccctctccatgcc 
1 MVRLFEGLRFSNRLSFSSILDFSTPFYA

21243 cgacccctcgagtgtcttgaggttttcgagcaggtccgacccttcgagaaatcgagcttttcgacctctaaattaggcccgatc 

29 RLFECFEVFEQVRLFEKLSFSTSKLCSI 

21159 attcgaaaagtttag 21145 

57 IRKV*
dplORF124

17891 atggtaaaagttaaagatttgcaagtaggaatgaaagctgtaaatgcaaaaggtactgaattcaaagtaactgaccgtcaaggt

1 MVKVKDLQVGHKVVNAKGTEFKVTDRQG

17807 cgtaaaegggtaagcctagaacgtcttagtgaeggacgtattcggttctatgataacgaatcactaatggacgaaaaagtggag 

29 RKWVSLERLSDGRIRFYDNESLMDEKVE

17723 gtagtaaaatga 17712 

57 VVK*
dplORF125

49916 atgtcctcagccgcttccgttaaaattggaacaagtgaattatatagatgcccctccttcagcccgtcgataaggtattcatca 

1 MSSAASVKIGTSELYRCSSFSLSIRYSS 

49832 gtctcgccaatttcgaaaaattcgaatccaggaaaatggccgagaatagtttcgtcgtccggaactcttccacatcccgaaaag 

29 VSPISKNSNPGKWSRIVSSSGTLPYLEK 

49748 tgttcttga 49740 

57 CS*
dplORF126

16136 atgagctcaagtacgctttctcgaacaatagggtcaagtccagttatatcaacgaactgtataccgtcctcttgtataggaata 

1 MSSSTFSRTIGSSPVISTNCISSSCIGI 

16052 aggtctgcgcacagttgcatggctgaccctctaattggagtaactgttccttcactgtttattttaaataaggtcatcatttct

29 RSAYSCMADPLIGVTVPSLFILNKVIIS

15968 accccctaa 15960

57 IL*

dplORF127

13511 acgctaaatagctttcccactcaccgtcgctgttcttgcgccatttctcagtttcacgataccgaccaactttgcaaaggtcgt 

1 MLNSFPIHRRCSCAI FQFHDTDQLCKGR 

13427 gaaatagtgctacgattgcaactgtttccattgggtaaacgccttcccagcctttgcctaccatggtatccatttcgaaaagta

29 EIVLRLQLFPLGKCLPSLCLPWYPFRKV 

13343 gttgattga 13335 

57 VD*

dplORF128

4852 atgacagcagttcaacaagttaagttctacttagaagaagccggcgctcactttctaaaagatgttgagtacagtgacaactta

1 MTAVQQVKFYLEEAGAHFLKDVEYSDNL 

4936 gagcaagcaattatgaaagatattcetaaatggaatggcgctcatagagatgagcacgaeaegaaaataacttcatacgaagta 

29 EQAIMKDILKWNGAHRDEHDMKITSYEV

5020 ttatag 5025 

57 L*
dplORF129

25133 atgaactttctgctaagcaacttgcgctcactgaagttcaaactaatgtacgcagccaccaatcttacactgaagaattcagta 

1 MNFLLSNLRSLKFKLMYAATNLTLKNSV 

25217 agaaggaaaaggcggacaaggaatgggaacgcattttggaagaacttgctcagcttgacgaaaceecagctggagcattgectg 

29 RRKRRTRNGNAFWKNLLSLTKSQLEHCL 

25301 tattag 25306 

57 Y*

dplORF130

16789 gtgcttgactttattcctttattatcgtataatcataatataaataaaacaagcgtcaaggacgcagaaagaggtcaattatgg

1 VLDFIPLLSYNHNINKTSVKDAERGQLW

16705 aaacaacactttatttcggtcatcttacagcagattggaaagacggtcacaagaactacactttccactatgaaagcacccctg 

29 KQHFISVILQQIGKTVTRTTLSTMKAFL

16621 taa 16619 
57 *

dplORF131 

43846 atgctcaaccggctgagaagaaacttggctggcagaaagatgctactggtttceggtacgctcgagcaaacggaacttatccaa 

1 MLNRLRRNLAGRKMLLVSGTLEQTELIQ 

43930 aagatgagttcgagtatatcgaagaaaacaagtcttggttctacttcgacgaccaaggctacatgctcgctgagaaatggttga 44013

29 KMSSSISKKTSLGSTLTTKATCSLRNG* 
dplORF132 

15304 gtgaceggaaggtcatctaatacacatagcctcaagacatttcgttggctttcaggaaaacactcgaccagattgtcaatgtat 

1 VTGRSSNTHSLKTFRWLSGKHSTRLSMY 

15220 cccacaaaggcttcaaggttttcgagttcttcgccgtggtcctttacagcaagaaggaagtccatccgacctcctgcacgttga 15137

29 PTKASRFSSSSPWSFTARRKFIRPLAR* 
dplORF133 

8061 atgactccttcattcatgacaagttttcgagttcctgcttgcttgtcaggaatagctttcccggcggctaaaatgtacagatta

1 MTSSFHTSFRVSACLSGIVFPAAKMYRI

7977 ccgtatctttctttcccgacagcagaacccgaatccatttgtatccccaccacttccgccccatctgcggcgaaataa 7900 

29 SYFSFLIAELESICIPTISALSAAK*
dplORF134

498 acgaceccaatgtacttaggctccatcaattcatacaagccattcaaaataatgttcatgcaaccttcgtggaagccaccgcgg

1 MTSMYLGSINSYKSFKIMFMQSSHKSPW 

414 ttacggaaactgaataagtacaatttcaatgatttagattcaaccatcttttcgtttggaatgtaa 349
29 LRKLNKYNFNDLDSTIFSFGM*
dplORF135 

780 acgaagcagaacccgaaaatgccgctaacgccgcaatgttctacggagtcaagttcaccacccctgaaattgacccgaaaacct

1 MKQNLKMLLMLQCSTESSSPFLKLTRKS

864 acecaagctctagctcttccttattacaaggaaaaggcgaaatctcacatggaaaatcttacgctgaaatcctag 938 

29 TQALALPYYKEKAKFHMENLTLKS* 
dplORF136 

55252 gcgaagaaatcttcaataaccttatccgcctctttgacagatacattcatctgctcagcgattgagtcagccccgcggccgtac 

1 VKKSSITLFASLTDTFICSAIELAPRPY

55168 ataagacccaaaagaacggacttgacagaacctcttcgaagttttccttcctcgttagtcgttccgtcgggatag 55094 

29 IRPKRTDLTEFLRSFPSLLVVPSG* 
dplORF137

37146 atgcttcgaacttgtttgttagcaccgtcaggaggacaaactagtcgaacccattcacctgcgtcttcgataatatctagcgcg

1 MLRTCLLAPSGGQTSRTHSPASLIISSA

37062 acagcgcctacagaagaagcaacgtgttccaacttcctaggcaagccttctgctagttcataccacaatgcgtag 36988 

29 TAPTEEATCFNFLGKPSASSYHNA*
dplORF138

30662 atgactatatcgaagaacaatgtagtcacccggcctatctgtatcttgctcgtoaaattcaacccccggaagcacaggagcagg 

1 MTISKNNVVIRPICILLVKFNSWKHRSR

30578 cgagagctgaaatgtaggaagaatttccctcaatctgtccaccattgtcgttcgtttagtcacgtccactcctag 30504 

29 RELKCRKNFLQSVHHCRSFSHVHS* 
dplORF139 

12092 atgatactaaatcactcaacttgtttgaccctcctgataaattcgtccacgcagacacgcgcatttgagcccttcctagatacc 

1 MILNHSTCLTLLINSFTQTRAFEPFLDT

12008 tctcgcaaacacccagatgcttccctcactaaaaggtcatgggcctcaagttcttcgaaagacatttctacatag 11934

29 FRKHLDASLTKRSWASSSSKDIST*
dplORF140 

20562 atgttttcgatatttcctgcgcccaagacttcagcttggtcattgttcaccaccattaggtattcattagtaagcgctttagca 

1 MFSIFPAPKTSAWSLFTTIRYSLVSALA

20646 aagtttgaaaatctcacttcattttccctttatttgtttttctttatactattattatacaataatgattga 20717 

29 KFENFILFSLYLPFFILLLYNND*
dplORF141 

42922 gtgccaagagttgtagagacatcctctaaaacgctcttggctttattcgatctccattcgaataacttacttagtaggacagta 

1 VLRVVEISSKTLLALFDFHSNNLFSRTV 

42838 agcactccgctgcacgctgtaataatcgtcgtcaagactgctgtgtcgcctagccacatcggcatagattga 42767 

29 STPLHAVIIVVKTAVSPSKIGID* 
dplORF142 

31898 gtgactgtcgaagcttctccaaacagttctgtcactttacctaaaagcgcattagggatttteccgttagcgattaggttcatg

1 VTVSVSPNSSVTLPKSVLGIPPLAIRFM 

31814 acacctgccgcccgaatcttaacatggataggttcactaccttttgaaaatcccggaagtgcgacgatttga 31743 

29 TPAARILTWIGSLPFENPGSAMI* 
dplORF143 

7565 atgaagcctgggttgacgcttttaactccagaccgtttaatttttccaaggcttgaaattggacaccacataatcttttcatgc 

1 MKFGLTLLTPDRLIFSRLSIGYHIIPSC

7481 ccttggaaatacactaaaattccggcgagaataaatttgcacccatctgcgcgtgatagetggaaccattga 7410

29 FWKYTKIPARINLHPSARDSWNH*
dplORF144

36517 gtgcaaatcaagcgactaacttattcagacacaccaaacgaggcgcattcttcaagatteccaatggaaaeccaacaactacca 

1 VQIKRLTYLDTLNSAHSSRFLMEIQQLP 

36601 ttgaataccgagccgatgacgcagcagctcggacctctactcttcccgctcaagttgaaccgtttccaa 36669 

29 LNTEPMTQQLGPLLFPLKLNCF*
dplORF145

42067 acggaaacagccggagacctaacaagtggaaagaggttctacttaagcaagacttcgaacagaataattggcagaaacctgttc 

1 METAGDLTSGKRFYLSKTSNRIIGRHLF 

42151 tccaaagcgggcggaaccatcactcaacccatggcgacgcactctacccgaaaactcttgacggcacag 42219 

29 FKVGGTITQPMATKSIRKLLTA* 

dplORF146

51484 acgacaaactgcatgattgcatcacctttccagcacggaacctcaagggcgaaacagtattcctcaaccgtcgaagtgttcgtt 

1 MTNCMIASPFQYGTSRAKQYSSTVEVFV

51568 ctaagtttcaccagtacggtgaagatgaccctaaaacggaatctctttacggccaatatgagcttgtag 51636

29 LSFTSTVKMTLKRNFFMANMSL*
dplORF147 

55207 atgtatctgtcaaagaagcgaataaggttattgaagatttctccaccgagttecctaaagtggcagaccataccacattcgctc 

1 MYLSKKRIRLLKISSPSSLKWQTISYSF 

55291 aacagcaggcgcaggacttgggatatgttcaaacagctaccggtcgaagaagaaggcttcccgatatga 55359

29 NSRRRTHDMFKQLPVEEEGFLI*
dplORF148 

28636 gtgcttcggttcaagaccattcgagtagggcgaacacctgtacgattttcgatgccatccattgccgccaaaatgtcagcgata 

1 VFRFKTIRVGRTPVRFSMSSIAAKMSVI

28552 gggtcactttcagccgggttagtccatttcttagtgactgcacattgctgcttagcatccatgttgtag 28484

29 GSLSAGLVHFLVTAYCCLASML* 
dplORF149

26474 acgccaccgaacttctcgagcataaggactaaccctgccccattgccccactccagctgcggcggaacggccaacggtagttcg 

1 MPLNFSSIRINLAPLSHSSCGGMAMGSS 

26390 agcaagccgaagggcattgtattcgagattttgacacttatgagcagcaggtttccctag 26331 




29 SKSKGIVFEILIFMSSRFP*
dplORF150

15185 gtggtcctttacagcaagaaggaagcttattcgacctcttgcacgttgatagtcttcgcgaagttcgacgattcgtttgttcat
1 VVLYSKKEVYSTSCTLIVFAKFDDSFVH 
15101 ttgctttcgctgattgttcatgcaataggctcctcgtatttaatagcttcacaagttgcgtcgacgtag 15033
29 LLSLIVHAIGSSYLIVSQVAST* 
dplORF151 

28027 atgattatatcaacgcaggggagattgctagctacattcaagcacttccttcaaacgctcttcaataccttggaccaactcttt 

1 MIISTQGRLLATFKHFLQTLFNTLDQLF 

28111 tccctaatgctcaacaaacagggacagacatttcatggctcaagggtgcaaataacttgccagtaa 28176
29 SLMLNKQGQTFKGSRVQIICQ*
dplORF152

42235 atgtgcataaaggacttatcgacaaagaggctactattgcagtacttcctgaaggatttagaccgaaagtttcaatgtatcttc 

1 MCIKDLSTKRLLLQYFLKDLDRKFQCIF

42319 aggctctcaataactcatatggaaatgccattctatgtatatacactgacggaagacttgtggtga 42384
29 RLSITHMEMPFYVYTLTEDLW* 
dplORF153 

22307 atggtggacaaagggctcaccttttcgaactttcgatatcgtcatagcagacggttccattcgttcaggaaaaacagtatcgat 
1 MVDKGLTFSNFRYRHSRRFHSFRKNSID

22391 ggctctttcattttccctttgggccatgacggaattcaacggacaaaactttgccatctgtggtaa 22456 
29 GSFIFPLGHDGIQRTKLCHLW*

dplORF154

18446 gtgacaataggctttaagaactgcaaaaaaacctggggcgtctgcacgcgcaacctggagctccttaacagtcatccaaggctg 
1 VTIGFKNCKKTWGVCTRNLELLNSHPRL

18530 aggtttcttacaaacaatcctaattccttcaaaatagctcttgtccgggtcaatagtgcctaa 18592
29 RFLTNNPNSFKIALVRVNSA* 
dplORF155 

13512 atgaatacgaccctgagcaacttacaatgggacatggtgcaaaatctaatttcctccttcaacgtttcattcaactcacgccag 
1 MNTTLSNLQWDMVQNLISFFNVSFNSRQ

13596 ttgaagctcaagcaattttctggcatatgggagcctatgatattagtccttatgcaaatttga 13658 
29 LKLKQPSGIWEPMILVLHQI*

dplORF156 

18777 atgctagtatctccatttctgttggtcttgctttttagctctgttcagttcagctgcttctcgcgatgcaatagtttcgagaat
1 MLVSFFLLVLLFSSVQFSCFSRCHSFEH
18861 atgcctgttcataggctcacaatattccgccaaagatttgccagttatggtggcgtcaattaa 18923
29 MPVHRLTIFRQRFASYGGVN*

dplORF157 

13281 gtgcttgctggacttgagaagaaattggtatcattttcgagccaatccataaggttctcgataccgtcacgattgattgtttct

1 VLAGLEKKLVSFSSQSIRFSIPSRLIVS

13197 gttactgctttcttgaagcgttttttaaagtctgteatattagacccctttcattttctataa 13135

29 VTAFLKRFLKSVILDPFHFL* 
dplORF158 

40727 gtgaacgccgttattagggtcaaacgaagcccaaacggacattgtctttgtcccgtcactattgtgaggaacagtcacttctcc 

1 VNAVIRVKRSPNGHCLCPVTIVRNSHFS 

40643 acttgcgagcgttacctcttcgccggacgtgtegtagtctgggtgactgctatgaacacttga 40581

29 TCERYLPAGRVVVWVTAMNT* 
dplORF159 

30371 atgatttggtctgcgcttacccaagcagcttctcctttgagtttctgtcgagcattccctgtacggtctgtccaaatagcatgc 

1 MIWSALTQAASPLSFCRAFPVRSVQIAC 

30287 gtctttgcgtattcttccaccteagtagcagcgacttcgcagactgttatgacagcgactcga 30225 

29 VFAYSSILVAATSQTVMTAT* 
dplORF160

41324 atgggttacagacacgcgaggaaaacaategaacgtccaagacgtatctatcaatgttatagaatactatggaccgtctatcaa

1 MGYRHARKTIERPRRIYQCYRILWTVYQ

41408 tttctccgttcaacgtactcgtcaaaatcctgcaattatccaagctcttcgaaatgctaa 41467 

29 FLRSTYSSKSCNYPSSSKC* 
dplORF161

52175 atgcaaaaaggtttaaatgcttatctcgacatgacattgaaagcattgcattcgagactatttcaaaatgtttggcaacgttca 

1 MQKGLKAYLDMTLKALHSRLFQNVWQRS 

52259 aatcaaaccaaggggccaagttttcaacttaccttacaagactcttcaagaatagaatag 52318

29 WQTKGPSFQLTLQDSSRIE*
dplORF162 

13020 atgacagaagttgcggtaaatagcccgcaaaaggtgagagtagttatggtcgggaatattgaatttctcgaatatttaaaaagg 

1 MTEVAVNSPQKVRVVMVGNIEFLEYLKR

13104 aagtacggaacagaaacttccatcagttatattatagaaaatgaaaggggtctaatatga 13163 

29 KYGTETSISYIIENERGLI* 
dplORF163 

40224 gtgaccgaatttctatgttctccgcagggaatgaagttatgtaccttacgcaagggttcattcacatcgataacgggatcttta 

1 VTEFLCSPQGMKLCTLRKGSFTSITGSL 

40308 cccaatccattcaagtcggccgatttagaacggaacaatactcgtttaatccagacatga 40367

29 PNPFKSADLERNNTRLIQT*
dplORF164

6696 atgtactctcggagaacttcgtgcctaaacgttccagcttcgcccattgcaattaggttagaatctgcgttatctataatagac 

1 MYSWRTSCLNVPASPIAIRLESALSIID 

6612 tcaccgattctttcgaaatacattcttcgaatacatccaccaaccccgctgggcttataa 6553 

29 SPILSKYIFRIHPPTPLGL*
dplORF165

50504 atgagcgaaagctggccaatccccaccacagatggtctataettagatatcatgctatctaaaattgcaggggtaaggttcttt 
1 MSESWSIPTTDGLYLDIMLSKIAGVRFF 
50420 cctccaatcataaagggcgtgactaccacaagggaatttccagcctcagtcattgcttga 50361 
29 PPIIKGVTTTREFSASVIA* 
dplORF166

23519 gtggtcatgctctttaatgactctatctcctcccgtttggctcgctttactgtcccagctgtaagcatagtattcatcaatgtc 
1 VVMLPNDSIFSRLARFTVPAVSIVFIHV 
23435 gtgcgtgttgctagggtcgagtgtaaatctattctcagccaagagttcagcgtgaaatga 23376

29 VRVARVECKSILSQEFSVK* 
dplORF167 

1008 atgcttattcggrtggagcttcttacgtcgtatatggtgctcacgcagacgatgcggctggaggtgcttaccctgactgcactc 
1 MLIRLELLTSYMVLTQTMRLEVLTLIAL 

1092 ctgagttceataattcaatgtcaaatgcaatggaatatggaactggaggcaaggtaa 1148 

29 LSSIIQCQMQWNMELEAR*

dplORF168

54345 atgagacttttcccaggttatattcttcacattgttcagttcctggagtcaagtattgttcttgaaattcatagagctcgaaag
1 MRLFPGYILHIVQFLESSIVLEIHRVRK 
54261 ttegcaaagggtcataggccgcaeacataeaggcaacatcaggaggaactaaactaa 54205
29 FAKGHRPHTYRQHQEELN* 
dplORF169

45954 atgaacacagcatcgcgaagagtctcaatgetagtgacaaggaagaattcgtcgttggccaccaagcaagtcccctgcccgttta
1 MNTASRRVSMLVIRKNSSWPPSKSSARL

45870 gaaactccgtcaatcactaatttcccatccttagtgactcgacttcctaaaatatga 45814
29 ETPSITNFPSLVTRLPKI* 
dplORF170

27600 atgatgattgttcttgtgctcctgccgtttgttgagcagcagcaagttgctcaecaaaagagccgatttcacgaggttcgggaa
1 MMIVLVLLPFVEQQQVAYQKSRFHEVRE 
27516 caccaccaecgacacgacctggaectcetaaatttccagtcccggctggcgacteag 27460
29 KHHRHDLDFLNFQSRLAT*
dplORF171

47678 atgccattttctttcatgtacccttttagagcaccacgaagactttcgacttgtttctccatgtcgcctttggtagcatttaac
1 MSFSFMYSFRASRRLLTCFSMSPLVAFN 
47594 tcaccggcfctcttcaattgcagcgatgaactgttttttcatctccaaatttcattta* 47538 
29 SPASSIAAMNCFSSSNFI* 



dplORF172 

10462 atgtttcgaacattttctaccccaccattagaagcagcatcaatttcaataggagagccaagccctttgttcacacccttcgcg 
1 MFRTPSTPLLEAASISIGEPSPLFTSFA
10378 aaaattcgagcagtagtggttttaccagttccagcgccaccacagaatagatag 10325 
29 KIRAVVVLPVPAPPQNR* 
dplORF173 

32160 atgacattagacatttccttcgtctgtacgaaaggtttcagcttgagtcacttcaccgtacaetgcactgaagattgtcataag 
1 MTLDISPVCTKGPSLSHFTVHCTEDCHK
32076 ccgctcacctgtcatatactcgccgacttcagcgcaagtaggctctaccaccga 32023 
29 LLXCHILADFSVSRLYH* 
dplORF174

29766 acgtcccatcagccctttccattaagattgtcgaaccagcgctcgacttttcaccagtttcaagctgttcttgcttatactggt 

1 MSHQPFSLRLSNQRSTFHQPQAVLAYIG
29682 cataatagaactgcgccacttgttcccagtagtctgcgtcaccttttagactga 29629 
29 KNRIAPFVSSSLRHLLD*
dplORF175

15648 atgcgcgtgatgtcatggcagacaggcgaggataaagagcgtcgaatagaacgccgcagagcttaegagagcgccaaatacaag 
1 MRVMSWQXGEDKECRIERRRAYESAKYK

15564 ggcgacggtactacggtggtcctcttgcttacctgtaaccaaataaaccattga 15511 
29 GDGTTVVLLLTCNQINH*
dplORF176

43031 gtgataaagacggtaacgttgaattcttctagttccgtcttgaatgacgccattttggtgattgatcgctactgtcgtttggtc 
1 VIKTVTLNFSSSVLNDVILVIDCYCRLV

42947 aatcccgtcgacctgctgcttaagagcgctaagagctgtagagatatcctccaa 42894 
29 HPVDLLFKSAKSCRDIL*
dplORF177

19937 acgaacctaaacagttcgagacttctcaagctgttgggaaagaagcaggccgaatattttggtgggaacgtgaactcggtcata 
1 MHLNSSRLLKLLGKKQVEYFGGNVHLVI
19853 ccctcgcgactaattttaggtgcttctgtaetaatcagcgtgatatgcgcttga 19800 
29 PSRLILGAFVLISVICA*
dplORF178

11924 aegacaactgtcgaccaatttaaaagacagetgaggaaaagtctaggctcaatttttcctccatcagtttccttaaatttgagc 
1 MTTVDQFKRQLRKSLGSIFPSSVSLNLS

11840 caatcagtaacctttagcgaattgctagcacttgccccccatattaagtcataa 11787
29 QLVTFSELLALASHIKS* 
dplORF179 

56058 atgggtagggttactccccacctcgttgatttgctctatgcaaaacctaccacaatcgcttgtcgcggcttcaggagttgcatt 

1 MGRVIPYLVDLLYAKPTTIACRGFRSCI

56142 tcggataagtcaaaaagcaagtgcccttatattegacaagctcccgaacaa 56192 
29 LDKSKSKCLYIRQALE*

dplORF180

41176 acgtccgacatgatttggaggaagttgctccctgctaaaatctgtcgaaccgctgaagccgtgtctactaaagaaatgcccgaa 
1 MFDMIWRKLFPVKICRTAEVVSTKEMPE

41092 aaagtaggacgtactgaatcggggatgttgaacccccatccgtttgaatag 41042 
29 KVGRTESGMLNLHPFE*

dplORF181

13126 atggaagctcctgtcccgcacctcccttttaaatattcgagaaattcaatattcccgaccacaactactctcaccctttgcggg 
1 MEVSVPYFLFKYSRNSIFPTITTLTFCG 
13042 ctatttaccgcaacttctgtcataggctgtcctcctttgcttatactgtaa 12992 
29 LFTATSVIGCPPLLIL* 
dplORF182

45369 gtgctcgcccacgtttcaataaatagggttcgacctcgcctagctttcgaacgcgctataacgatttcaatcacagcgaagaaa 
1 VLAHVSINRVRPRLAFERAITISIIAKK
45285 ggtgagaagcttcaatcaattccaetgcggtgtcaatatcttcttccttga 45235 
29 GEKLQSIPLRCQYLLP*

dplORF183 

13896 gtgactccagctcttggtttcccttcagccccttcaactttttcttccttaggcgcaggtttcttacgagttgaactcttaggc 
1 VIPAFGFSSASSTFSSLGAGFLRVELLG
13812 ttttcttcaactacttcttcaacctcagcctcttgttcaactggaccttga 13762 
29 FSSTTSSTSASCSTGP*
dplORF184 

53330 gtgaacttgccgtcaaccacgtcaaacatetggtcctcgtcgaggcctaaaattagagttccaagaagttcgctccttcctgga
1 VNLPSTTSNIWSSSRSKIRVPRSSLFSG

53246 aaatcttcaagagtagcactgtcttccggacgctctggaaggaattcacaa 53196 
29 KSSRVALSSGRSGRHS* 



dplORF185

22522 atgaaattcgagatgttcgaaacgaaaatctacttattattagacactctagaaatggcgaagaaattgtcaaccacttctata 
1 MKFEMFEMKIYLLLDTLEMAKKLSTTSI
22606 tatttggaggaaaagatgagtcgagtcaagacctcatacagggggtaa 22653 
29 YLEEKMSRVKTLYRG* 
dplORF186 

21272 acgctcgaaaaactcaaccggttcgaaaacctcaatccttcgaaaagtcgaaccattcgaaaagttcaaaagttcgaaaaactc
1 MLEKLNRFENLKPSKSRTIRKVQKFEKL 
21356 aaccattcgagagtaggaattaaggacataccagttcaacctttttag 21403
29 NHSRVGIKDIPVQPF*
dplORF187 

34415 atggtcttgttcaatctcctcctaccatcattcaagcagctgttcaaattatcactgctttattcaatggtcttgtccaggcac 
1 MVLPNLFLLSFKQLFKLSLLYSMVLFRH 
34499 ttcctacgcttattcaagcaggtcttcaaattctgtcagctctcataa 34546 
29 FLRLFKQVFKFCQLS* 
dplORF188 

35609 attgttcgtaaagcagccggttcgcctcgagtggacttgttcaatacaggaagtgacaaccctaaccaacctcagtcacaatcta 
1 MFVKQPVRLEWTCSIQEVTTLTNLSHNL

35693 aaaacaatcaaggcgagcaaaccgtcgccaacatcggaacaatcgtag 35740
29 KTIKASKPLSTLEQS* 
dplORF189

42587 atgcaaacgcagtatcaaccgtccctgaaactctccatgacccagacttgtatgctgcgaaccgtcgagaacttcgagctgacg
1 MQTQYQPSLKLFMTQTCMLRTVENFELT 
42671 agcaaaaacttcgcgaaactcgctacgcaatcgaagatgaaattctag 42718 
29 SKNPAKLVTQSKMKF*
dplORF190

39786 atgcattcactcaaagttgttcagtgtggctcaatcatatcaaaatcgaacctggtaatatctctactccttttagtgaagcag 
1 MYSLKVVQCGSIILKSNLVISLLLLVKQ

39870 aggaagaccttaaatatcgaattgacccaaaagccgatcaaaagccaa 39917 
29 RKTLNIELTQKPIKS*

dplORF191

40996 atgtccattgttccggaacttgatttaggtaagtaccttgccaagtccagtgacggcgtaaaggatacgctagtagtatggttc 
1 MSIVPELDLGKYLAKSSDGVKDTLVVWF 
40912 ttacctaaacctatccagtcgctaccgaaaactcggtaccaaacttga 40865
29 LPKSIQSLPKTRYQT* 
dplORF192 

2920 acggtcgacgtcgaatgttttctcgagatgaagtttagggccttctcgataccctacggtatgttcagcgagtgctttaacaaa 
1 MVDVECFFEMKFRVFSIPYGMFSECFHK
2836 acggaatggagtatcttgcaacccgtcacgttctgcgtcctcgcctaa 2789

29 TEWSILQPVTFCVLA*

dplORF193 

42456 atgatttcagctcaaattaaacacgaaacgagacactgtccaaatttaaccaagaattatctacattcgattteaccacaagtc
1 MISAQIKYEMRHCLNLTKNYLHSLSPQV 
42372 ctccgtcagtgtatatacatagaatggcatttccatatgagttattga 42325 
29 FRQCIYI EWHPHMSY* 

dplORF194 

40284 atgaacccttgcgtaaggtacataacttcatcccctgcggagaacatagaaattcggtcacttgataccttaatggcagagcta 
1 MNPCVRYITSFPAEMIEIRSLDTLMVEL 
40200 ccgtegtccttaccgacaattagaccttcatcagaagagctcatgtaa 40153
29 PSFLPIIRPSLEELM* 
dplORF195

42584 atgttcacaaccgtcgtcttgacaagcttcttttcagcccctcgtccaacagtgaactctgccacaatctggcgcgacttcgta 
1 MPTIVVLTSFFSAPCPIVNSATIWRQPV 
42500 aggttcaacatagtcctcacctccLttctaaaaaatattataacatga 42453 
29 RPNIVLTSFZ.KNIIT* 
dplORF196

11273 atggtagatttaacaagtccctgtccaatcatgtcactcctccttgcccatcaaaagaagtttggtttcaattatcggcccagc 
1 nVDLTSPCPlHSLLLAHQKKFGFUYRFS 
11189 attaggctcccatttaacaactccagcaageecattcatttcttctag 11142 
29 IRLPFNNSSKFIHFP* 
dplORF197

7484 atgaaaagattatatggtatccaatttcaagccttgaaaaaattaaacggtctggagttaaaagcgtcaacccaaacttcatcg 
1 MKRLYGIQFQALKKLNGLELKASTQTSS 
7568 atgcagggtatgaagcttcecacaagaagcgtcgaaccagactga 7612

29 MQGMKFLTRSVBtiD* 



dplORF198 

24119 atgccgcccaacaaattgacgtccagccttattcaatgcctcagttcacctatacagttgaccccagaaacccttccagcttgc 
1 KPLNKLTSSPtQCLSSPIQLTLBTLPAC 
24203 tttctgttgacactgtttatcaggacgagcgtacaaaaggaatga 24247 
29 FLL.TLFXRTSVQKE* 
dplORF199 

15742 gtggctcctgaattaggctgtactttccctcccaactgctcagcaactgccttctcttgtttagcactagccctgcgcgtggga 
1 VAPELGCTPPPNCLATAFSCLALALRVG 
15658 attggtttgtatgcgcgtgatgtcatggcagataggcgaggataa 15614 
29 IGLYARDVMADRRG*

dplORF200

47843 atgacaggcttgtatccgacaagccctgaaagtttttcacacatttcttccgtctcggcttcgtcaactaatttttcgataatt 
1 MTGLYSISPESFSHISSVSASSTNFSII 
47759 tctttcaagcgttcttcgtccatagttgagcgctctgtcgtgtag 47715 
29 SFKRSSSIVERSVV* 
dplORF201

38569 atgggctccacaagttccctctttaaccaaaggtcaacatctttggactcgaactacttggacctataccgattcaactaccga 
1 MGFTSSFPNQRSISLDSNYLDLYRPtJYR 
38653 aacgggctaecaaaaaacctacattccaaaagacgggaatga 38694 
29 HOLSKMLH8KRRB* 

dplORF202

44483 gtggggcgtttattetttataaaaattcttracaaaatgcttgacaacatCcaetcattatcgcataatacaattaKaaaaata 
1 VGRLPFIKIFYKMLDNIHSliSYNTIIKI 
44567 aataaagccgaaaggcgaggaggacatcatgtcaaaaattaa 44608 
29 JfKAERRGGHYVKS* 
dplORF203

22781 gtgactaggattggccgggttacaagagaaccacattttcgaacctgttacggaacagcgccetgtcgcttggttgacaaacga 
1 VIRIGRVTREPHFRTCYGTAPCRLVDKR 
22697 ttcaggcatcagtgccacctcatcacagaagatacctgccaa 22656 
29 FRKQCHLITEDTC* 
dplORF204

1471 atgaccacggttcgagtcaagggacggttgttgacttttatcacgtcaagaaaatcgcaggcacatccattgacagacttgacc 
1 MTTVRVKGWLLTFITSRKSQVHSLTDLT 
1555 acgctgttettcttcaagggaatgaaccaatcgctctag 1593

29 TLFPPKGMNQSL* 
dplORF205

8524 gtgacactgatgaatggttctcagttcggtacgccactcgcgacgcagatatctcctacgaccaaagaactgcccaatctagaa 
1 VTLMWGSQFGMLLVTQ1 SSTTKELPNLE 

8608 ttcaggaaaagcaacctgctateaagttcaatttcgtag 8646 

29 FRKSMLLSSSIS* 
dplORF206

19855 atgaccaagttcacgcccccaccaaaacattcgacccgcttctctcccaacagcttgagaagtctcgaactgctcaggttcacc 
1 MTKFTFPPKYSTCPFPMSLRSLBtiFRPI 
19939 aaattgttcaaettgagcaagtgcgatattattetttag 19977 
29 KLPNLSKCDIIL* 
dplORF207

27502 gtgtcggtggtggtgtteccgaacctcgtgaaaccggcccttttggtaagcaacttgccgccgctcaacaaacggcaggagcac 
1 VSVVVFPMLVKSALLVSNLLLLNKRQBH 
27586 aagaacaatcatcattctttaaataataggaggaactaa 27624 
29 KNMHHSLNNRRN*

dplORF208

47279 atgtttggtatgaagcaaaagacctcgctgaagaaaacaacattcacttcccgtctgttcttcctgaacctagaacagaccttg 
1 MPGMKQKTSLKKITFTSRLFFLKI.BQTL 
47363 accatcgtggttctcgattctgggatgacgaaggcgtga 47401 
29 T1VVLOSGMTKA* 
dplORF209

29784 atgttaagaaccaagctcgtagagccattgaaaccgcccctactaaaatcaaggtacttcgaaactcttgggtcagtgatggat 
1 MLRIKFVEPLKPLLLKSRYFETLGSVMD

29868 atggaggaaagaaaaaggataaagcgaatgaagtcgtag 29906 

29 MEERKRIKRMKS*
dplORF210

53077 atgctteaacetttcccgtatcatggttgtaaagttgaagaaatagcecttcaatacgagggaacccgtcttggcataatggae 

1 MFQLFPYHGCKVEEIVFQYEG IRFGIMD 

52993 aattatcaggatggactgttcccccgtcttcgccaatag 52955 

29 NYQDGLFPRLRQ* 



dplORF211

20959 gtgctcgacttttatgtcgcccctaatttttgtttttacttacggaccatgggatttgtaggtattttcagggcgcttttttat 
1 VLDPYVAPNFCFYLRTMGPVG I FRALFY 

20875 ttacttattaagtccetctctatattagattgtctataa 20837 
29 LLIKSFSILDCL*

dplORF212

52983 atggactgtctccccgtcttcgccaatagcactgcaattgatatagcgtcgacgaccgtcaacgtctgcttcgcggactacgaa
1 MDCFPVPANSIAIDIASTTVNVCFVDYE 
52899 ataatccatgtcttcgccttccgggccatcatacaatag 52861
29 IIHVFAFRVIIQ* 
dplORF213 

30291 atgcgcctttgcgtattcttccatcttagtagcagcgactccgcagactgttatgacagcgactcgaaacccgttccgacaccg 

1 MRLCVFFHLSSSDFADCYDSDLKLVSI P 

30207 ttcacagttactaacaaattcttcaggcttccatactaa 30169
29 FTVTNKFFRLPY* 
dplORF214

24273 atgatgccaaagttgtttttcagtgctcattccttttgtacgctcgtcctgataaacaatgtcaacagaaagcaagctggaagg 
1 MMPKLPFSAHSFCTLVLINNVNRKQAGR 
24189 gcttctagggtcaaccgtataggcgaactgaggcattga 24151 
29 VSRVNCIGELRH* 
dplORF215

35822 atgttaccaaaccctgatagagtctccttacttccattatacaatcctctcgacagttcgtcaacgtcgccaccgtttcgaact 

1 MLPNPDRVSLLLLYNPLDSLSTSSLFRT 
35738 acgattgttecaatgttgacaacggttegctcgccttga 35700 
29 T-IV'PMLTTVCSP* 
dplORF216 

32849 atggcctcggagctcgcggccacacctcctccagatacggcagccaggtcaagtacccccggcatagcgcccacgatttcattt 
1 MASELAATSPPDTAARSSTPGIASMISF 
32765 acctggaaaccggctgaagctagattttccataccttga 32727
29 TWKPAEARFSIP* 
dplORF217

23443 atgaatactatgcttacagcegggacagtaaagcgagccaaaegggagaagatagagtcattaaagagcatgaccactgcatgg 

1 MNTMLTAGTVKRAKREKIBSLKSMTTAW 

23527 ataggaacagatatgcctgtctcactgacgctctaa 23562 
29 IGTDMPVSLTL* 
dplORF218

22029 atggaatgctcccggaagaggtccgatatagactacaaattgagcgcgagaaaatcacattgctccgggccaaaacgggcgacc
1 MECFRKRFDIDYKLSARKLHCSGPKWAT 
22113 aggaaattgaaggcgaggtcaaagacaacttcgtag 22148
29 RKLKARLKITS* 
dplORF219 

51388 atgattttatgctcgactttttcagttctcccatttcttcgaaacgcttcagggctgacgccttgcctaactacttcgctagat 
1 HILCSTFSVLPPLRNASGLTPCLTTSLD 
51304 gttccaaaattccttttcagccactggtttccatag 51269 
29 VPKFLFSHWFP» 
dplORF220

6334 gtgaagttttcttcggtgacggttgatacaatttccttcaagagcaagctgttaaggtggcaagcgaattctttccccgaaact 
1 VKFSSVTVDT I S FKSK LLRHQVNS FFBT 

6250 ttcttgccagcagatgcgtacatgatgtcttcacaa 6215 

29 FLPADAYMMSS • 

dplORF221

43507 acgactgctcaagctctacgtactatgctctccgcccagccggagcttcaagtgccggatgggcagtcaatactgagtacatgc 
1 MTAQVLCTMLSAQPBLQVLDGQSILSTC 
43591 acgcaeggcetattgaaaacggttatgaactaa 43623
29 THGLLKTVMN* 
dplORF222

13212 gtgacggtaccgagaaecttatggattggctcgaaaatgataccaatttcttctcaagtccagcaagcactcgataccatggaa 
1 VTVSRTLWIGSKMI PI SSQVQQALDTME 

13296 gctatgaaggtggacttgtcgagcactcattaa 13328

29 AMKVDLSSTH*

dplORF223

14055 acgcggcggtacctgctggacacgttcgagatgcccaccactcctacagcgaagccgctgacgtttaccacaagaaagatgccg 

1 MWWYLLDHFEMSTTSTVKSLTFTTRKMS 

14139 acgagcctgacgatgacagcgacattcttgtag 14171 

29 TSLTMTATFL* 
dplORF224

13621 atgccagaaaattgcttgagcttcaactggcgtgagttgaatgaaacgttgaagaaggaaattagatcttgcaccatgtcccat
1 MPENCLSFNWRELNETLKKEIRFCTMSH

13537 tgtaagctgcccagggtcgcattcatatgctaa 13505
29 CKLLRVVFIC* 
dplORF225 

32991 gtgagcaacgggtgcgacgtatttcatcgcctctgccatgtcgctagtttctgcgttcgtatcagctgctgctcgagcaaatac 
1 VSNGCDVFHRLCHVASFCVRISCCSSKY 
32907 gtcagccacgtgaccegcctggtttgcctctaa 32875 
29 VSHVTRLVCL* 
dplORF226 

25191 gtggctgcgcacattagtttgaacttcagtgagcgcaagttgcttagcagaaagttcatcgctaggaattggatagtggtgttc 
1 VAAYISLNPSERKLLSRKPIARNWIVVF 
25107 gatagtcattgtcgtaagtgtttgataacttga 25075 
29 DSHCRKCLIT* 
dplORF227

23115 atgactcaattagatggtagcgcttatgacgtttcgagaatccataaaggccgaaggttgttgcattatagataccaaagtcgc
1 MTQLDGSAYDVSRIHKGRRLLHYRYQSR 
23031 ctgctacgaataaacggtcgaattctatattga 22999 
29 LLRINGRILY*

dplORF228 

10450 atgttcgaaacattattgaagattctagatacaagtctatggacagcgagttcaaagtttacatcattgacgaggttcatatgc 
1 MFBTLLKILDTSLWTASSKFTSLTRFIC 
10534 tttcaaccggagcatttaatgcgctgttga 10563
29 FQPEHLMRC* 
dplORF229

27634 atgtgcgagttaagaaaactgattttaatcaaaccactcgaagcattgtcgcaattcctgaccactacgttgctttggctgctc 
1 MCELRKLI LIKPLEALSQFLTTTLIiWLL 

27718 aaattccagctaccgcagcaactcaagtag 27747 
29 KFQLPQQLK* 
dplORF230

50723 gtgacgaaaaatccggcacacttgaactatctgtcgttaaaaaccgatatggcgaagaccgaaaaatcatcgaatatatgtggg 
1 VTKNPAYLNYLSLKTDMAKTEKSSNICG
50807 acgttgaaactggaacctatactcttatag 50836
29 TLKLEPILL* 
dplORF231 

31071 atgcgcgtgtcattgcgtttcacatcttcagttccctccgaggtcacggcttcgagttctgctgtttctgccgtatctacgaca 
1 MRVSLRFTSSVPSEVTASSSAVSAVSTT 
30987 aagttagctccgcegacttttggcaactga 30958 
29 KLAPFTFGN* 
dplORF232 

29385 atgtcaattccattagctcttgctaattcaacgagctcaggaacggttttagccgcatactcttcgcgcatttgttcaacttcg 
1 MSIPLAIiAMSTSSGTVLAAYSSRICSTS 
29301 tcaatttctccaactgattcaattgtttga 29272 
29 SISSTDSIV* 
dplORF233

52892 atgtcttcgccttccgggtcatcatacaatagagtgacaattgcgctgtcaccgtggtcagcgagtgtgaaaaactcgttatta
1 MSSPSGSSYNRVTIALSPWSASVKNSLL 
52808 gaccctgagctaaatgttcctgatttttga 52779 
29 DPBLNVPDF* 
dplORF234 

36253 atgcttacgagtacagcgactcaactgttcgaaaggtttataagtttcaacccgctttgggaggcgatagcttacctaacccag 
1 MLTSTATQLFERFI SFNPLWEAIAYLTQ 

36337 gaagacctactcgacaatttagagtag 36363 
29 BDLLDNLE* 
dplORF235 

32766 atgaaatcatggacgctatgccaggggtacttgacctggctgccgtatctggaggagatgtggccgcgagctccgaggccatgg
1 MKSWTLCQGYLTWLPYLBBMWPRAPRPW 
32852 ctagttcacttcgagcctttggattag 32878 
29 LVHFEPLD*

dplORF236 

37528 atgttcgtcgcttttagatttagcaatatatcgaggcttcatgtggcgtgtagtaaaccacgaaacatcaatgagatattcact 
1 MPVAFRFSNISRLHVACSKPRNIMBIFT 
37444 tccattgttgatagaagcaaacgttaa 37418 
29 SIVDRSKR* 

dplORF237 

1678 gtgagagtccaggtaaggaatcttgacatattctcagccgtagttctaaatccaaatagaactcgcttggtgtcaaCTgcattt 
1 VRVQVRNLDIPSAVVLNPNRTRLV-STAF 
1594 gctaaagcgattggttcattcccttga 1566 

29 AKAIGSFP*

dplORF238 

1301 atgcctttttgcggtcgatacaagttgcgcaagttccacaactttcagcgtcactttcataacatgaacgagtcaagaaataag 
1 MPFCGRYKLRKFHNFORHFHNMMBSRNK 
1217 gaacatctaaatcaattccccatttaa 1191

29 BULMQFPI* 
dplORF239 

26521 atggcgaagtatttcctatcgaagaacgtcctttcgaccaccctaacggaatgtgctaccaaactgtatggtacgaaaactcac 
1 MVKYFLSKNVLSTI LMECATKLYGTKTH 

26605 tcgaagaaatcgctgatgagttga 26628
29 SKKSLHS*

dplORF240 

41893 atgtttggaacaagcgtgaaacagagcttacacggcgaagcaacaaacacgaggacaaccctacgggaactcgaggcgaatggg 
1 MFGISVKQSLHGEVTNTRTTLRELEVNG 
41977 gactatttcaaaatttctggttag 42000 
29 DYFKISG* 
dplORF241

47020 gtgtctttccttaatatggagatagtcttcattctatttaagcaggatatcgaaaaggttaccaattttagatttcataggctt 
1 VSFLNMEIVFILFKQDIEKVTNFRFHRL 
46936 accatctacgatataatctgctaa 46913 
29 TIYDIIC*

dplORF242

41338 gtgtctgtaacccatgctcttacggtagcggagccattaaagttcatcatacccaacttgccgccgtttccgtcgatagcttgg 
1 VSVTHALTVAEPLKFI 1 PNLP PFSLIAH 

41254 tttttacctacgagctcagcgtga 41231
29 PLPTSSA* 
dplORF243 

51306 atgttccaaaattcrttttcagccactggtttccatagaaccctccatcgtttcgacctaataeattcgagacgaattcagtta 

1 MPQNSFSATGFHRTLHRPDLIHSRRIQL 

51222 gtcctgaagtgtagccgcaagtga 51199 
29 VLKCSRK* 
dplORF244 

27083 gtgaggtacaaaatgctgaccgtcgccgccaatgaaaattttagcatcgagttctttcgaagttttcgaaacaatttccttcac
1 VRYKMLTVAVNENFSIBPFRSFRNNFLH 
26999 ctgtttgatagttggttcatctag 26976 

29 LPDSWPI* 
dplORF245

6278 gtggcaagtgaattctttcttcgaaactttcttgccagcagatgcgtacatgatgtcttcataactgctagtagaagtrtttaat 
1 VASBFFLRNFLASRCVHDVPITASRSFN 
6194 tcgaageeggtctttcaagaataa 6171 

29 SKSVPQB* 
dplORF246 

2831 acggagtatcttgcaacccgtcacgttctgcgtcctcgcctaatagaccaaaaagtctttgaacggctgcctcagtattgtcca 
1 MBYLATRHVLRPRL1DQKVFBRLPQYCP 
2747 aggttacaatttcatccggcttaa 2724 

29 RLQFHPA* 
dplORF247

29641 gcgacgcagactactggaaacaaatggcgcaactctatcacgaccaataeaagcaagaacagcttgaaactgatgaaaagtcga 
1 VTQTTGNKWRNS I MTtTI S KN S LKLMKSR 

29725 acgctggttcgacaatcttaa 29745 
29 TLVRQS* 
dplORF248

53560 gtgcaaagcctcgttctagcaagaagaacgatgctcagttacttgctcaaeggaaaaacaggaagcctgcagttgaggtcactt 
1 VQSLVLARRTMLSYLLNGKTGSLQLRLL 
53644 acatttcaggaaacgctctaa 53664 
29 TPQBTL* 
dplORF249 

2012 gtggatgcgactatcattgcaaccggtgtgacccagcctttacctggaacggtactactgagccggaatatatcacaggcaaag 
1 VDATIIATGVTQPLPGTVLLSRMISQAK 
2096 aagctgctagtegaatcttga 2116 

29 KLLVES*



dplORF250

23837 atgggcaaacatggaagattgacgaagactcagtcgaccataaacctactcgagaaattcgaaactatattcgacaacttatca

1 MGKHGRLTKTQSTINLLBKFETIFDNLS 

23921 aaaagcaatcaegctttatga 23941 

29 KSHHAL* 
dplORF251

39205 atggaaataattagtcttaccgtccgcgcctggcttcccgggtatcccttgagcteegtcattccccttccatttcgtccatgt 

1 MEIISLTVCAWLPGYPLSSVIPLPFRPC 

39121 ataggctgcagggtcttttga 39101 

29 IGCRVF*

dplORF252

54771 gtgttgtataggtcgaaactaatttcgcatattctctatatttcaaaagtgcttttgagatatcgttateaaaatgctcgacaa 

1 VLYRSKLILHIFYISKVLLRYRYQNARQ 

54687 tactttcgcetgttcctctag 54667 

29 YFRLFL* 

dplORF253 

56255 atggttgcgtctataatagaaccgatgttgccagacaaagcatttgcaatcttcgagtctaatttattcgagagcttgtcgaat 
1 MVASIIEPMLLDKAFAIFESNLFESLSN
56171 ataaagacaettgctttctga 56151 
29 IKTLAF*

dplORF254 

48479 atgaaccettcgcttaggttcaatctttttcgaacactttcaeatttaacaaaactttcagctaaaaaccgacaaagttcaaeg 
1 MNLSLRFNLFRTFSYLTKLSAKNRQSSM 
48395 ttcgactcaatgtttaaataa 48375 

29 FDSMFK*

dplORF255

9572 atgctttggtcttctcgacgaatgaccctactacattccctgcagggtttcgagcagtacgggccaatgatgcaccgttttcgt 

1 MLWSSRRMTLLHSLQGFEQYGSHNHRFR 

9488 caaggtagtcaccttttctaa 9468

29 QGSHLF*

dplORF256

15289 atgaccctccagtcactaatgcggccgctgaaattggataccactatacatgggttcaccaacctcgagacaaagcagttgaaa 
1 MTFQSLMRPLKLDTTIHGFTNFBTKQLK 

15373 cacctgaagaaaccccag 15390 

29 HLKKF*

dplORF257

28216 gtgaacgtgctggatttagcaaacaagctactgagatggcattcttccgtgagtctatgcgacetggtgaaaaagaccgtcaaa
1 VNVLDLANKLLRWHSSVSLCDLVKKTVK

28300 acttgcaaatgctattga 28317
29 TCKCY*

dplORF258

44023 acggaaattggtattggttcgaccgcgacggatacatggctacgtcatggaaacggattggcgagtcatggtactacttcaatc
1 MEIGIGSTVTDTWLRHGNGLASHGTTSI 
44107 gogaeggttcaatggtaa 44124 
29 AHVQW*

dplORF259

4298 atgactcgactacgaagcataaagacaagtggacggaaagagtattcgaagttattcgaaacagctctaatccagacgttaaga 
1 MTRLRSIKTSGWKEYSKLP8TVLIQTLR 
4382 ctcacgcatttgggatga 4399 

29 LTHLG*

dplORF260 

24746 gtgaecctacttcctcaaecggcggtaccggaggcaagcaagcccaagtcacttccatttcaggaaactceaacttccttecag 
1 VTLLPQSAVLEAS KLKSLPFQETSTSPQ 

24830 cggctgaatattatttag 24847
29 RLNII*

dplORF261 

288 atgaattcacttccctttgccctaaaacaggacagcctgacttcgcgaatgttttcattagttacattccaaacgaaaagatgg 

1 MNSLPPALKQDSLTSRMFSLVTFQTKRW 

372 ttgaatctaaatcattga 389 

29 LNLNH*
dplORF262

9408 atgcctattcaactccaggcggaaagatgtggaagcatgcttgtgcagtccgacttaaatttagaaaaggtgactaccttgacg 

1 MPIQLQAERCGSMLVQFDLMLEKVTTLT 

9492 aaaacggtgcatcattga 9509 

29 KTVHH*



dplORF263

27052 aegaaaatcttagcaccgagtectttcgaagctttcgaaacaatctcctecacctgtttgacagccggtccatctagacctttt 

1 MKILASSSFEVFEI ISPTCXiIVGSSRPP 

26968 aacaagtcctctaattga 26951 

29 NKSSM*
dplORF264 

6139 gtgaatagtacaaggcggtctaacacgctcaggatttccgctgtagggatagccgcatcatcttcaaactcaattgagtcaagc 

1 VNSTRRSNTLRISAVGIAASSSSSISSS 

6055 tgtgaaacgtcttcaeaa 6036 

29 CSTSS*
dplORF265

4801 gtgaataaagtcaagcgtttttgtataaaaagttcatttctttttaaaaaaaataagagcgaaaagctcttatctaaaatagtc 

1 VHKVKRFCIKSSFFFKKHKSEKLLSKIV 

4717 gacgttgacgatttttaa 4700 

29 DVDDF*
dplORF266

50220 atgcccgttcttccaagcagttgcaagcacttcaccaatagtccacgacttaccttgtccaggtcgagccattatgacaatcaa 

1 MP VLPSSCKHFINSPRLTLSRSSHYDNQ 

50136 atcctcaccaggaagtaa 50119 

29 ILTRK*

dplORF267

47367 atggtcaaggtctgttctaggttcaggaagaacaaacgggaagegaatgttattttcttcagcgaagtcttttgcttcatacca 

1 MVKVCSRFRKNKREVNVIFPSBVFCF1P 

47283 aacattaatcgtagatag 47266 

29 NINRR*

dplORF268 
12621 atgtcaatttcggccccgtgctcgacaatggattcaactaccgatgcgccaacctttttcaatcgcgacagcttgtccaattca
1 MSI SVLCLTMDSTTDASTFFNROSLSNS 

12537 ttgccaattccagagtaa 12520
29 LSILE*

dplORF269 

53334 gtgaatagtatcgagtccatcagtttctacgtcaatagaacctattccgtcttcaatcattttgtctacatactgctegagttt 

1 VNSIESISFYVNRTYSVPNHFVYILLBF 

53750 tgcttcctcagtgattaa 53733 

29 CFLSD*

dplORF270

50792 atgatttttcggtcttcgccatatcggtttttaacgacagatagttcaagtatgccggatttttcgtcacgcttcatagcgata 
1 MIFRSSPYRFLTTDSSSMPDFSSRFIAI 
50708 actetgctageattttga 50691 
29 TLLAP*

dplORF271 

19739 atgaggctgctttgctttatcttcgttaccgtattgaccgacttcctactcgcgaaccttcccacaagaattcatacctcaaag 
1 MRLLCFI PVTVLiTDFLLANLPTRlHTSK 

19655 gctttttgtcagccttag 19638 
29 AFCQP*

dplORF272

1556 gtggtcaagtctgtcaatgaatgtacctgcgattctctcgacgtgataaaagtcaacaaccatcccttgactcgaaccgtggtc 

1 VVKSVNECTCDFLDVIKVNMHPLTRTVV 

1472 ataagttccgcctgctaa 1455 

29 ISSAC*
dplORF273

56256 atggacttcattaggactgagtcctcttggaattggaacggttgcatatatagatattccgccagccgtactaggccaagttct 

1 MDFIRTESSWNWNGCIYRYSVSRTRPSS 

56340 agctcagtttatcccgcagtcaatcgcttcgagatatctgaaaaagtagtcaggaaaattcctgattatcttgcagtcaattgc 

29 SSVYLAVNCFEI FEKVVRKI PDYI»AVNC 

56424 ttcgagatatttgaaaaagtagteaggaaaattcctgattattttttttacaaaaacgcttga 56486 

57 FEIFSKVVRKIPDYFFYKHA* 






Table 31 



Query= sid|114822|lan|dplORF001 Phage dpi ORF|36698-40390|2
(1230 letters) 

>gi|928828 (L44593) ORF1904; putative [Lactococcus lactis phage BK5-T]
Length = 1904

Score = 427 bits (1086), Expect = e-118

Identities = 226/475 (47%), Positives = 281/475 (58%), Gaps = 45/475 (9%)

Query: 395 AESGKYIGVLNTNKKPSELVPDDrTWIRLBGPKGDAGLPGAPGRDGVDGVPGKSGVGIAD 454

A+ YIG + P D+TW + +G+ G GA G+DGV GK GVGI 
Sbjct: 620 ADYPSYIGQYTDFIQYDSAKPSDYTWSLI RGNDGKDGATGKDGV AGKDCVGIKT 873

Query: 455 TAITYAVSVSGTQEPENGWSEQVPBLIKGRFLWTKTFWRYTIXjSHETGYSVAYIGQDGNS 514

T ITYA+S SGT +P GW+ QVP L+KQ++LWTKT W YTD 3 ETOYSV YI +DGN+ 
Sbjct: 874 WITYAl^SSGTDKPMTGWSQVPTLVKGQYLWTKTVWTYTDSSSBTGYSVTYIA 933 

Query: 515 GKIXSIAGmrVGIAATBVm'ASSPSATEAPAGGWSTQVPTVPGGQYLWTRTRWRYTDQ^ 574 

G DGIAGKDGVGI T + YA ST APA GW++QVP VP GQ+LWT+T H YTD T 
Sbjct: 934 GNDG IAGXDGVGIKXTTI TYAVGTSGTTAPASGVWSQVPWVPAGQFTjWTKTVWTYTDWrS 993 

Query: 575 EIGYSVSRMGEQGPKGDAGR DGIAGKNGIGLKSTSVSYGISPTDSAIP-GVHASQVP 630 

E GYSV+ MG +G KGD G +GIAGK+G G+K+T+++Y SP + P G W++ VP 
Sbjct: 994 ETGYSVAMMGVKGDKGDPGNKGTNGIAGKDGKG I KATAI TYQAS PNGTTAPTGTWSASVP 1053 

Query: 631 SLIKGQYLWTRTIWTYTDSTTETGYQKTYIPKDGNDGK^^ 690 

+ KG + LWTRTIKTYTD +TTETGY Y+ +GN+G +G GKDG GIK+TTITYAGST 
Sbjct: 1054 PVAKGSPLWTRTIWTYTDNTTETGYAVAYMGTNGNNGHI^ m 3 

Query: 691 SGTVAPTSNWTSAI PNVQPGFFLWTKTVWNYTDDTS BTGYSVSKIGETXXXXXXXXXX3CC 750 

SGT P + HTS +P V Q +LWTKTVW YTD +TSBTGYS V+ +0 
Sbjct: 1114 SGTTPPNNGWrSTvT^TVABGim 4 >mcrVWTYTO VKGDKGDP 1167 

Query: 751 XXXXXXXXXXADGRS - QYTHLAFSNS PNGEGPSHTDSGPJV YVGQYQD FHPVHSKDPAAYT 809 

DG+ + T ♦ + SPNG A G + P +K +T 

Sbjct: 1168 GNKGTNGI AGKDGKG I KATAITYQAS PNGT TAPTGTWSASVP PVAKGSFLWT 1219

Query: 810 WTKW KGNDGAGGIPGKPGADGKTNYFHIAYASSADGS 846 

T W GN+G G PGK 0 KT I YA S G+ 

Sbjct: 1220 RTIOTYTDKTTETGYAVAYMGTIKINNGHDGFPGKDG^ 1272 



Score = 396 bits (1007), Expect = e-109

Identities = 208/449 (46%), Positives = 260/449 (57%), Gaps = 42/449 (9%)

Query: 421 IRLEGPKGDAGLPGAPGRDGVDGVPGKSGVGIADTAITYAVSVSGTQEPBNGWSGQVPBL 480 

♦ ♦ G KGD G PG +G +C+ GK G GI TAITY S +GT P WS VP + 
Sbjct: 1155 VAMMGVKGDKG DPGKNGTNGIAGKDGKG I KATAITYQAS PNGTTAPTGTWSASVPPV 1211 

Query: 481 IKGRPLWTKTTWRYTIXJSHETGYSVAYIGQDGNSGK^ 540 

KG PLWT+T W YTD + ETGY+VAY+G +GN+G DG GKDG GI T + YA 3 S 
Sbjct: 1212 AKGSFMITRTIWTYTDWITETGYAVAYMGTNGNNGHDGFPG 1271 

Query: 541 TEAPAGGWSTQVPTVPGGQYLWTRTONRYTDG/TD EIGYSVSRMGEQGPKGDAGR DGI 597 

T P GW++ VPTV G YLWT+T W YTD T E GYSV+ MG +G KGD G +GI 
Sbjct: 1272 TTPPNNGWTSTVPTVAEGNYLimaVWTYTDNTS 1331 

Query: 598 AGKNGIGLKSTSVSYGISPTDSAIP-GVWASQVPSLIKGQYLWTRTIWTYTDSTTETGYQ 656 

AGK+O G+K+T+++Y SP + P G W++ VP ♦ KG +LWTRTI WTYTD+TTETGY 
Sbjct: 1332 AGKDGKG I KATAI TYQAS PNGTTA PTGTWS AS VP PVAKG S FLWTRT I WTYTDNTTETGYA 1391 

Query: 657 KTYIPKDGNDGKNGIAGKDGVGIKSTTITYACSTSGTVAPTC^ 716 

Y+ +GN+G +G GKDG GIK+TTITYAGSTSGT P + WTS +P V G +LWTK 
Sbjct: 1392 VAYMGTOGNNGHDGFPGKDGTGIKTTTITYAGSTSGTT^^ 1451 



Query: 717 TVWNYTDDTSETGYSVSKIGETXXXXXXXXXXXXXXXXXXXXXXADGRS-QY^ 775 

TVW YTD+TS ETGYSV+ +G DG+ + T + + 3 

Sbjct: 1452 TVWTYTDNTSETCYSVAMMG VKGDKGDPGKNGTNGIAGKDGKGI KATAITYQAS 1505
Query: 776 PNGEGFSHTDSGRAYVGQYQDFNPVHSKDPAAYTWTKW KGHD 817

PNG AG* P +K +T T W GN+ 

Sbjct: 1506 PNGT TAFTGTWSASVPPVAKGSFLWTRTIWTYTDNTTETGYAVAYMGTNGNN 1557

Query: 818 GAQGIPGKPGADGKTNYFHIAYASSADGS 846

G G PGK G KT I YA S G+ 
Sbjct: 1558 GHDGFPGKDGTGIKTT--TITYAGSTSGT 1584



Score = 384 bits (977), Expect = e-105

Identities = 179/322 (55%), Positives = 222/322 (68%), Gaps = 7/322 (2%)

Query: 421 I RLEGP KGDAG LPGAPGRDG VDGV PGKSG VG I ADTAITYAVSVSGTQE P ENG WSEQVP EL 480 

+ + G KGD G PG +G +G+ GK G GI TAITY S +GT P WS VP + 
Sbjct: 1311 VAMKGVKGDKG- - - D PGNNGTNG I AGKDG KG I KATAITYQAS PNGTTAPTGTWSAS VP PV 1367 

Query: 481 IKGRFLWTKTFTOYTDGSHETGYSVAYIGQDGNSGKDGIAGKDGVGIAATEVMYASSPSA 540 

KG FLWT+T H YTD + BTGY+VAY+G +GN+G DG GKDG GI T + YA S S 
Sbjct: 1368 AKGSFLWTRTIWTYTD^^TETGYAVAYMGTNGNNGHDGPPGK13GTGIKTTTITYAGSTSG 1427 

Query: 541 TEAPAGGWSTQVPTV PGGQYLWTRTRWRYTDQTD E IG YSVSRMGEQG P KGDAGR - - - DGI 597 

T P GW++ VPTV G YLWT+T W YTD T B GYSV+ MG -*3 KGD G +GI 
Sbjct: 1428 TTPPNNGWTSTV PTVABGNYLWTKTVWTYTDNTS BTGYSv^^ KGD PGNNGTNG I 1487 

Query: 598 AGKNGIGLKSTSVSYGISPTDSAI P-GVWASQVPSLIKGQYLWTRTIWT YTUS 'l'r tf r U *Q 656 

AGK+G G+K+T+++Y SP + P G W++ VP + KG +LWTRTIWTYTD+TTBTGY 
Sbjct: 1488 AGKDGKGI KATAITYQAS PNGTTAPTGTWSASVPPVAKGS FLWTRTIWTYTDNTTETGYA 1547 

Query: 657 KTYIPKDGNDGKNGIAGKDGVGIKSTTITYAGSTSGTVAPTSNWTSAIPNVQPGFFLWTK 716

Y+ +GN+G +G GKDG G IK+TTITYAGSTSGT P + WTS +P V G +LWTK 
Sbjct: 1548 VAYMGTNGNNGOTXJFPGIOX3TG I KTTTITYAGSTSGTTPPNN 1607 

Query: 717 TVWNYTDDTSETGYSVSKIGET 738

TVW YTD++ ETGYSV K+G T 
Sbjct: 1608 TVWAYTDSSFBTGYSVGKMGNT 1629 



Score = 201 bits (507), Expect = 2e-50

Identities = 121/297 (40%), Positives = 156/297 (51%), Gaps = 19/297 (6%)

Query: 421 IRLBGPKGDAGLPGAPGRDGVDGVPGKSGVGIADTAITYAVSVSGTQSPBHGWSEC^PEL 480 

+ + G KGD Q PG +G +G+ GK G GI TAITY 5 +GT P WS VP + 
Sbjct: 1467 VAMMGVKGDKG - - - D PGNNGTNG I AGKDGFffl I KATAITYQAS PNGTTAPTGTWSASVPPV 1523 

Query: 481 IKSRFLWTKTFHRVTTCSHETGYSVAYTGQDGNSGKDGIAGKDG^ 540 

KG FLWT+T W YTD + ETGY+VAY+G +GN+G DG GKDG GI T ♦ YA S S 
Sbjct: 1524 AKOSFLWTRTIVrrnDNTrBTOYAVAYMGTNGNNGHDGFPGlOXJ^ 1583

Query: 541 TEAPAGGWSTQVPTVPGGQYLOTRTRWRYTDQTO 600 

T P GW++ VPTV G YLWT+T W YTD + E GYSV +MQ GP AG +G GK 
Sbjct: 1584 TTPPN1JGWTSTVPTVAEGNYLWT1CTWAYTDNSF 1640 

Query: 601 NGIGLKSTSVSYG I S PTDSAI PGVWASQVPSLI KO - QYLWTRTIWTYTDSTTE- - TGYQK 657 

♦ T+ G++ S++ ++G+YW W + G 

Sbjct: 1641 VVSDTEPTTKFTCGLTWKYSGVVDHPljGNGTKIIAGTEYYWNGNNW 1700 

Query: 658 TYIPKDGNDGK-NGIAGKDGVGIKSTTITYAGS TSGTVAPTSNWTSAI PNVQ 708 

+ DGK I G +GV ♦ T T GS +S + TNTAINQ 

Sbjct : 1701 8VTNGTFKDGKIESIWGSNGV NGTTTXEGSHLQIHSSDSTTNTEN-TLAXDNRQ 1753 



Query= sid|114823|lan|dplORF002 Phage dpi ORF|32386-35835|1
(1149 letters) 

>dbj|BAA31888| (AB009866) orf 15 [bacteriophage phi PVL]
Length = 694

Score = 280 bits (709), Expect = 3e-74

Identities = 157/465 (33%), Positives = 257/465 (54%), Gaps = 28/465 (6%)

Query: 40 QIGSALTGLGKGLTTAVTLPLMGPAAASIKVGNEFQAQMSRVQAIAGATAEELGRMKTQA 99

+IG+++ +G+ +T VT A + K G EF M +V+A +GAT EE +K +A 

Sbjct: 151 EIGNSMKNVGRNMTMYVTAPWAGFAVAAKKGIEroDSMRKVKATSGATGEE 210 
Query: 100 IDI/yUCTAFSMCEAAQGMENlASAGFQVNEIto^ 159

++GA T FSA ♦ * +A AG* + GV+DL + L 

Sbjct: 211 REMGATTKFSASDSAEALNYMAIAGWDSKQMMEGLSGVMDUAAA^GEEliCAVSDIVTDGL 270 

Query 5 160 RAFGLEANQAG^ADVTAJLAAAtfnMJ^ 219 

AFGL+A +GH+ADV A+ ♦+ N + + EA KYVAPVA ++G ++E+T+ +IG+M+ 
Sbjct: 271 TAFGUCAKDSGHUu}VIAQTSSICMmJWGI£^ 330 

Query: 220 DAGIKGSQAGTTLKCiAI^RIAKPTKAMVXSMQELGVSFYT3ANGNMIPlJlEQIAQLKTATA 279 

+AGIKG +AGT LR + ++ PT+AM M+ LG+S D+NG MIP+R+ + QL+ 
Sbjct: 331 NAGIKGEKAGTAlJiTMFT^SSPTRAMGNEWERI^ISITDSNGKMIPMHKLLDQIJ^^ 390 

Query: 280 GLTQEERNRHLVTLYGQNSLSGMLALLDAGPEKI^lG^rNALVN^ 339

T++G+ *+SG LA+++A E K+T ++ +S GA+K MA+TM+ L 
Sbjct: 391 HLSKDQQASSAATIFGKEAMSGALAI INASDEDYQKLTKSIDSSTGASKRMADTMBSGLG 450 

Query: 340 SKIEQMGGAFESVAIIVQQILEPALAXIVGAITKVLEAFVNMSPIGQKMWIFAGKVAAL 399 

K+ + E +A+ + +EPAL IV A +KV+ + Q W F VA L 

Sbjct: 451 GKIJllljI^QLEELALTrYDRIEPAIJCIIVSAFSKVVTWVTKLPTSIQIAVVGFGLFVAVL 510 

Query: 400 GPLLXjIAGM VWTTIVXLRIAIQFLGPAFMGTMGT1AGVIAIF 441 

GPL+ ♦ G+ MT + L Z ♦ P IA +*■ +F 

Sbjct: 511 GPLVFMFGLFISVMGHAinVljGPIitKVlIKASGLPAFIiRTKlASLVXLFPlLGVSISSLT 570 

Query: 443 YALVAV- - - PMIAYTJCSERFRMFINSLAPAI KAGFGGA 476

ALV + F AY +SE FRK +N + FA 

Sbjct: 571 LPITLIVGALVGIGIAFYQAYKRSETFRNIVHQAISGVANAFfWA 615 



Query= sid|114824|lan|dplORF003 Phage dpi ORF|53536-55877|3
(779 letters) 

>sp|P43741|DPO1_HAEIN DNA POLYMERASE I (POL I) >gi|1074025|pir||B64098 DNA polymerase I
(polA) homolog - Haemophilus influenzae (strain Rd KW20)
>gi|1573871 (U32767) DNA polymerase I (polA)
[Haemophilus influenzae Rd]
Length = 930

Score = 191 bits (481), Expect = 1e-47

Identities = 148/553 (26%), Positives = 262/553 (46%), Gaps = 60/553 (10%)

Query: 63 RLBUTEBAKLEQYVDKMlEIXSIGSIDVBTDGIi^ 122 

+ E ♦ +A L +++*K> * ++D ETD LD + L G+ ♦ + V P+ 

Sbjct: 333 KYETtXTQ^LTRWIEIOiRAAJXIAVircSTO 392 

Query: 123 SNMTKMRII04QISPEPMKKJ41^RIVDSGIPVIY^ 182 

+ +X +L+ + I I N KFD +SI+ R G+++ +DT h 
Sbjct: 393 YLDAPKTLEKSTALAAIKPILE NPNIKKIGQNIXFD-ESIFARHGIELQGVEFDTML 448 

Query: 183 AAMLIJfENESHSIJCSlJiSKYVRjraENASV^ 242

+ LN H++ L +Y+ +E A ♦ + P* IP ♦ A YAA O T 

Sbjct: 449 LSYTLNSTGRHNMDDLAKRYIjGHET IAFBS LAGKGKSQLTFNQI PLEQATE YAAEDADVT 508 



Query: 243 FBLYEFQEQYLTPGTEQCEBYNI^iCVSKVLKtflBMPLrKVI^ 302 

+L ♦ U E Y +B+PL* VL MB GV +D D L 

Sbjct: 509 MXLQQALWLKIiQBE PTLVBLYK TMELPLLHVLSRMERTGVLIDSDALFMQS 5S9 

Query: 303 EQFTANKNBAEQEFQQLVSEWQPBIEEIJlQTNFQSYQKI*EMDARGRVTVSISSPTQIiAIL 362 

+ + + B++ L + QIj + 

Sbjct: 560 NEIASRiTALBKQAYALAGQ -PFNLASTXQLQBI 592 

Query: 363 PYDIMGLKSPERDKFRG TGESIVEH- «FDNDISXXXXXXXXXXXXVSTYTT-LDQHL 416 

♦D ♦ h *+ P+G T E ++E ♦ +++ STYT 1 Q ♦ 

Sbjct: 593 LFDKLELPVLQKT - PKGAPSTNEEVI^ELSYSHBLPKILVKHRGLSKLKSTYTDXLPQMV 651 

Query: 417 AXPDNRIHTTFKQYGAKTGRWSSENPNLQMIPSRGE-GAVVRQIFAASEGHYIIGSDYSQ 475 

R+HT++ Q TGR+SS +-PNLQNIP REG +RQ P A EG+ 1+ +DYSQ 
Sbjct: 652 NSCTCRVHTSYHOAVTATCRLSSSDPHLQMIPrRNEEGRHIRQAFIAREGYSIVAADYSQ 711 

Query: 476 QEPRSIAELSGOESKRHAYEQHLDLYSVTGSIOjYGVPYEECLEPYPDGTTNKEGKIiRRIIS 535 

E R +A LSGD+ + +A+ Q D++ +++*GV +E T+++ R ♦ 
Sbjct: 712 IELRIMAHLSGOQGLINAFSQGKDIHRSTAABI FGVSLDE VTSEQ RRN 759 



Query: 536 VKSVIJ£I>JYGRGANSIAEG*DJVSVKEANKVIEDFFT£PPKVADYI IFVQQQAQDliGYVQ 595 
K++ GL+YG A +♦ Q+ +S +A K ♦+ +F +P V +♦ ++++A+ GYV+ 
Sbjct: 760 AKAINFGLI YGMSAFG LSRQLG I SRADAQKYMDLYFQRYPS VQQFMTD IREKAKAQGYVE 819

Query: 596 TATGRRRRLPDMS 608

T GRR LPD+ + 
Sbjct: 820 TLFGRRLYLPDIN 832

Score = 46.9 bits (109), Expect = 5e-04

Identities = 34/123 (27%), Positives = 66/123 (53%), Gaps = 16/123 (13%)

Query: 663 EIKDQAKAEGI LIKDNGGKIADAQRQCLNSVIGGTAADMTKYAMIKV 709 

+I+-M-AKA+G + N + A+R ♦N* +QGTAAD+ K AMIK+ 

Sbjct: 807 DIREKAKAQGYVETIJGRRLYLPDINSSHAMRRXGAERVAINAPMQGTAADIIKRAMlia 866 

Query: 710 HNtUlELKELGFHLMIPVHDEIiGEVPIKNAKRGAERLTEVMIEAAKDIISLPMKCDPSIV 769 

++ + +++ VHDEL+ EV + B++ ♦ M EAA +++ +P+ ♦ + 

Sbjct: 867 -DEVlRHDPDIEMIPIQVHDELVrEVRSEKVAFFREQIKQHM-EAAAELV-VPLIVEVGVG 923 

Query: 770 ERW 772
♦ M 

Sbjct: 924 QNW 926 

Query= sid|114825|lan|dplORF004 Phage dpi ORF|40401-42440|3
(679 letters) 

>emb|CAB07981| (Z93946) hypothetical protein [bacteriophage Dp-1]
Length = 532

Score = 1011 bits (2585), Expect = 0.0

Identities = 497/499 (99%), Positives = 498/499 (99%)

Query: 1 MTXFINSYGPLHLNLYVEQVSQDVTNNSSRVSWRATVDRI^ 60 

MTXFINSYGPLHl^LYVEQVSQDVTNNSSRVSWRATVD^ 
Sbjct: 1 MTKPXNSYGPLHI/rcjYVEQVSQDVTNNSSRVSWRATVDRDGAYRTWTy^ 60 

Query: 61 SSTOSSHFDYIJTSGEEVTIASGBvTVPKNSDGTKTMSVWASro 120

SSVHSSHPDYDTSGBEVT1ASGEVTVPHNSDGT7074SVWASPD 
Sbjct: 61 SSVHSSHPDYirrSGEEVTIASGEVTVPHNSlXrrKTKSVWASro 120 

Query: 121 DSIPRSTQISSFBGNRNLGSLirrVTFNRKVNSPTHQVWYR^ 180 

DSI PRSTQISSFEGNRNLGSLHTVIFNRXVNSFTHQVWYRVFGSDHIDLGK^ 
Sbjct: 121 DSIPRSTQISSFEGNRNLGSLHTVlPiiRXVNSFTHQVWYRVTGSDW 180 

Query: 181 PSLDLARYLPKSS SGTMDX CIRTYNGTTQIGSDVTSNGWRFNI PDSVRPTFSG ISLVDTT 240 

PSLDLARYLPKSSSGTMD ICIRTYKGTTQIGSDVYSNGWRFM1PDSVRPTPSG I SLVDTT 
Sbjct: 181 PSl^LARYLPKSSSGT>ffl I CIRT^GTTQIGSDVYSNGHRFNIPDSVRPTFSGI SLVDTT 240 

Query: 241 SAVRQILTGNNFLQIHSNIQVNFNUASGAYGSTIQAFHAELVGKNQAINENGGKLGMMNF 300 

SAVRQILTGNKP1>QIMSNIQVNFNNASGAYGSTIQAFKAELVGKNQAINENGGKLG>1MNF 
Sbjct: 241 S AVRQI LTGNNFLQ I MSN I QVNFNNASG A YG ST I QAFHAELVGKNQA I NENGGKLGMMNF 300 

Query: 301 NGSATVRAWVTDTRGKQSNVQDVS INVI EYYGPS INFSVQRTRQN PAI IQALRNAJCVAP I 360 

NGS ATVRAWVTDTRGKQSWVQD VS I NVI EYYGPS INFSVQRTRQN PAI IQALRNAKVAP I 
Sbjct: 301 NOSATVRAHvTDTRGlCQSWVQDVSIMVIEYYGPSINFSVQRTRQNPAlIQAl^^ 360 

Query: 361 TVGGQQIOTIMQITFSVAPLNTTNFTEDRGSASGTFr^ 420 

TVGGQQKHIMQITFSVAPLNTTNFTEDRGSASGTFTTISL+TNSSANIJU3NYGPDKSYIV 
Sbjct: 361 TVGGQQKNIHQITFSVAPLinTNFTEDRGSASGTFTriSI^TNSSANl<AGNYGPDKSYIV 420 

Query: 421 KAKIQDRFTSTEFSATVATESVVLNTOKIXJRLGVGKVVEQGKAGSIDAAOT 480 

KAKIQDRFTSTEFSATV TESWLNYDKDGRLGVG KWEQGKAGS I DAAGD I YAGGRQVQ 
Sbjct: 421 KAKI QDRFTSTE FS ATVPT ESWLNYDKDG RLGVGKWEQG KAGSI DAAGO I YAGGRQVQ 480 

Query: 481 QFQLTDNNGALNRGQYNDV 499 

QFQLTDNKGALNRGQYNDV 
Sbjct: 481 QFQLTDNNGALNRGQYNDV 499 



Query= sid|114827|lan|dplORF006 Phage dpi ORF|45296-46987|2
(563 letters) 

>gb|AAD18987| (AE001666) SWI/SNF family helicase_2 [Chlamydia pneumoniae]
Length = 1166

Score = 171 bits (429), Expect = 1e-41

Identities = 150/522 (28%), Positives = 254/522 (47%), Gaps = 55/522 (10%)
Query: 46 SSNNFE- LPYKYFNNVIDAIJDEWELHIFGEIXiKXJVQDYIDSRNRIASSSNEQFSFKTTPF 104

S ♦ FE LP ♦ «■+ + h E + I GE + + D QD + T 

Sbjct: 659 SLDQFEALPWF- -SMSERLIEIQKQIRGEIEFQFQD VPQQIQATLRSYQTEG 709 

Query: 105 AHQVECFEYAQEHPCFLI^DEQGUJKTXQAIDIAVSRKASFKH- -CLIVCCISGLKWNWA 152 

H +E + H *L D+ GLGKT QAI IAV++ K C +♦ C + L +NW 

Sbjct: 710 VHWLE- -RLRKMHLNGILADDMGLGKTLQAI - XAVTQSKLEKGSGCSLIVCPTSLVYNWK 766

Query: 163 KEVGIHSNESAHIU3SRVTKIX;iaVIIXV-SKRAEDI^GHDEFFLITNIETUlDAVPIK 221 

+E ♦ S LVIOGV S+R ♦ L D IT+ L+ V 
Sbjct: 767 E EFRKFNPEFR - TLVIDGVPSQRRKQLTALADRDVAITSYNLLQKDV 812 

Query: 222 YliNBLTKSGEIGMVIIDEIHKCKNPSSXC^IQKLQSYYXMG^ 281 

Bla KS V++DE H KN +++ S++ +QS +++ LTGTP+ N+ +♦+++ 

Sbjct: 813 ELYXS FRPDYWLD EAHH I KNRTTRNAKS V KM I QSD HRL I LTGT P I ENS L E E LWS L F 869 

Query: 282 KWI/SAJSHHTLTQFKERYCIVDQFNQITGYR- - NLAELRELVNDYMLRRTKEEVL -DL 335 

+L L +R+ V ♦+ + Y N+ L++ V+ ++LRR KE+VL DL 

Sbjct: 870 DFLMPG---LIaSSYDRF--VGKYIRTCNYMGNKADIMVALK^ 924 

Query: 336 PEKIRVTEYVDMNSKQSKIY KEVLTKLVQE I DKVKLM PNPLAETI RLRQATGN 388 

P + + ♦ Q ++Y K+ L++LV++ *+ + LA RL+Q ♦ 

Sbjct: 925 PPVSEILYHCHLTESQKELYQSYAASAKQELSRLVXQEGFERIHlHVlATLTRLJCQrCCH 984 

Query: 389 PSILTTQDVK SCKFERCIEIVEECIQQGKSCVIFSNWEKVIEPLAKIL-SKTVKCNL 444 

P+I + 3 K++ ++++ t G V+FS ♦ K++ + K L S+ + 

Sbjct: 985 PMFAKEAPEPGDSAKYDMLMDLI^SLVDSGHXTVW 1044

Query: 445 VTGETADKFNEIEEFMNHRXASVILGTIGAliGTGFTLTKADTVIFLDS PWTRAEKDQAED 504 

+ G T ♦+ + + +F V L A GTG L ADTVI D W A ++QA D 

Sbjct: 1045 LDGSTKNRLDLVNQFNEOPSl^VFLI SLKAGGTGI^LVGADTVIHTOMWWNPAVENQATD 1104 

Query: 505 RCHRIGAKSSVTIYTLVAKGTVDERIEDLIERKGELAOYIVD 546

R HRIG SV+ Y LV T++S+I L RK L 
Sbjct: 1105 RVHRIGQSRS VSS YKLVTLNTI EEKILTLQNRKKS LVKKVIN 1146 



Query= sid|114828|lan|dplORF007 Phage dpi ORF|22230-23621|3
(463 letters) 

>gi|2444105 (U88974) ORF26 [Streptococcus thermophilus temperate bacteriophage O1205]
Length = 411

Score = 88.9 bits (217), Expect = 7e-17

Identities = 60/315 (25%), Positives = 133/315 (41%), Gaps = 48/315 (15%)

Query: 139 QGVTLAGI FCI3EVAIJ4PES FWQATGRCSVTGSKMWFSCNPANPNHYFKJQTHIDKQVEKR 198 

♦G T G + +E +L E + RCS G*++ + NP UFNH+ ♦♦+! K + + 
Sbjct: 121 RGFTAFGAYVNEAS LANELVFKE I ISRCSGDGARWTOSNPDNPNHWLNREYIGKN- DGK 179 

Query: 199 ILYLHFTMDDNPSLT DSIKRRYEKMYAGVFRKRFILGLWVTADGLVYSMFNEEQHV 254 

1+ F +DDN L+ DSIJC K G F R ILGLW A+G +Y+ ++ ♦ HV 
Sbjct: 180 IIDFSFKLDDNTFLSKRYIDSIKAATPK GKFYDRDILGLWTVAEGA1YADYDSKIHV 236 

Query: 255 KKLKIEFDRLFVAOTFGIYNATTFGLYGFSKRHJ^YHLIESY^fHSGREAEEQLTEADVNS 314 

E R P D+G + ♦ + G ++L++ +B ♦ + +A 
Sbjct: 237 VDBLFEMKRYFGGIDWGYTHYGS I VIVG- EGVDNNFYLVIXjVAAQFKEIDWWVEQA 291 

Query: 315 Kl QFS SVLQKTTKEYANDLVDMI RGKQ I EYI I LDPS AS AM I VELQKHPYIAR - - -KN1PI 371 

♦K T Y N + + ++AR ♦ I 

Sbjct: 292 RKLTGIYGN IPFYADSARPEHVARFENEGFDI 323 

Query: 372 I PARNDVTLGZSFHAELLAENRFTLDPSNT - HD I DEYYAYS WDS KASQTGEDRV I KEHDH 430 

+ A V GI A+L E ♦ ♦ DE Y Y W ++ +D +KJB D 
Sbjct: 324 MMANKSVIAGIEL1AKLFKEKKLYVKRGFVPRFFDEIYQYRWKENST KDEPLREFDD 380 

Query: 431 CMDRNRYACLTDALI 445 

+D RYA +D +1 
Sbjct: 381 VLDSVRYA1YSDYV1 395 

Query= sid|114829|lan|dp1ORF008 Phage dp1 ORF|49624-50961|1
(445 letters)

>gb|AAD19901| (AF100420) DnaB replication fork helicase [Thermus aquaticus]
Length = 444
Score = 67.5 bits (162), Expect = 2e-10

Identities = 69/248 (27%), Positives = 111/248 (43%), Gaps = 14/248 (5%)

Query. 147 GERMISTGFEXXXXXXXXXXXKXXXIVI^PG^ 205 

GS G+ TGF+ I I ARP GK+ + ♦ A K G V +YS 

Sbjct: 178 GEVAGVTITGFKELDQLIGTLGPGSLNI-IAARPAMGKTAPALTIAQNAALKEGVGVGIYS 236 

Query; 206 GEMSEMQVGARIDTI LSNVS IMS ITKG IWNDKQFEKYEDHI QAMTEAENSLWVTPFMI G 265 

EM Q+ R+ + + +N 4 G D F + D ++EA ♦ TP ♦ 
Sbjct: 237 LEMPAAQLTLRMMCSEARIDMOTVRLGQLTDRDFSRLVDVASRLSEAP- IYIDDTPDLTL 295 

Query: 266 GKHLTPAILDSMISKYRPSVVGIDQLSLMS--ESYPSREQKRIQYANITMDLYKISAKYG 323 

+ A ++S+ + t+ ID L LMS S S 8 ++ + A 1+ L ♦+ + G 
Sbjct: 296 ME--VRARARRLVSQNQVGLIIIDYI^IJ1SGPGSG£SGENRQQEIAAISRGLKA^^ 353 

Query: 324 IPIVLNVQAGRSAKTEGAESMELEHIAESDGVGQNASRVIAMXRD EKSGZLEL 376 

IPI+ Q R+ * ♦ L + ES + Q+A V+ + RD EK+GI E+ 

Sbjct: 354 IPIIALSQLSRAVEARPNKRPMLSDIiRBSGSISQDADLVMPIYRDEYYNPHSEKAGIAEI 413 

Query: 377 SWKNRYG 384 

V K R G 
Sbjct: 414 IVGKQRNG 421 



Query= sid|114831|lan|dp1ORF010 Phage dp1 ORF|8699-9859|2
(386 letters)

>gi|3760912 (AF037258) RecA protein [Chlorobium tepidum]
Length = 346

Score = 133 bits (331), Expect = 2e-30

Identities = 99/340 (29%), Positives = 164/340 (48%), Gaps = 66/340 (19%)

Query: 44 GGLPRKJIVVEFFGPESSGKTTSALDIVKNAQW^ 103 

GGLPR RV B +GPESSGKTT AL + AQ 
Sbjct: 67 GGLPRGRVTEIYGPESSGKTTLALHAIAEAQ KNG 100 

Query: 104 AVKBLEMQLDSLQEPIJCIVYLDLENTIiDTF/tfAKXIGVD^ 163 

+ L +D B> D +A+K+GVD++ ♦ ♦ + PE S E+ L V 

Sbjct: 101 GIAAL - VDAEHAFDPTYARKLGVDINAIAVSQPE--SGEQALSIVB 143 

Query: 164 DIEXrGEVGLVVU3SLPYKVSQHLIDEELT!0<AVAGISAPLTEFSRKVTPLLTRYNAIFL 223 

+ +G V ++V+DS+ +V Q ++ B+ ♦ RK+T +++ *♦+ L 

Sbjct: 144 TLVRSGAVDIIVIDSVAALVPQAEI^EMGDSVra 203 

Query: 224 GINQIREDMNSQYNA-ySTPGGKMWIG^CAVRLXFRKGDYIJDENGASLTRTARNPAGNW 282 

IHQ+R+ + Y + VP GGK K +VRL RK + ++G L GM 
Sbjct: 204 FINQLRDK3GVMYGSPETTTGGKALKFYSSVRLDIRXIAQI -KIX3BBLV GNRT 255 

Query: 283 ESFVEKTKAFKPDRKLVSYTLSYKDGIQIEITOLVDVAVBFGVIQKAGAWFSIVDLETGEI 342 

+ VKK PR + + Y+GJ+ +L+D+AVEFG+I+K+GAWFS G 
Sbjct: 256 WnCVVTCNKV-APPmABFDILYGEGrSVMELIDIAV^FGIIKKSaAW 312 

Query: 343 MTDEDEEPLKFQGKANLVRRPKEDDYLFDMVMTAVHEIIT 382 

QG+ N+ + KED+ L + + V +**T 
Sbjct : 313 QGRENVKKLLXEDETLRNTI RQQVRDMLT 341 

Query= sid|114832|lan|dp1ORF011 Phage dp1 ORF|28017-29096|3
(359 letters)

>gi|2444110 (U88974) ORF31 [Streptococcus thermophilus temperate bacteriophage
O1205]

Length = 348
Score = 187 bits (469), Expect = 1e-46

Identities = 118/358 (32%), Positives = 187/358 (51%), Gaps = 21/358 (5%)

Query: 3 1 YDYINAGB J ASYI QALPSNALQYLGPTLF PNAQQTGTDI S WLKG ANNLPVT I QPSNYDA 62 

I YD + A IA Y AL N US ++FP +Q GT *S++KGA+ V ++ ♦ +D 
Sbjct: 4 IYBKVTASNXAGYFNALQEHVS STLGE S I F PARXQLGTKLS YXKGASGQS VALKAAAFDT 63 

Query: 63 KASLRERAGFSKQATEMAFFRESOTLGE^RQNI^MLI^QSSA-I^QPLITQLY>TOTKNL 121 
++R+R +M FF+E+M ♦ E DRQ L ♦+ ♦ +A L *♦ ++ND L

Sbjct: 64 hn/TIRDRVSAEMHDEQMPFFKEAMLVKElTORQQLNLVia)SGNAVLVin , IVACIFMDNLTL 12 3 

Query: 122 VIXJVEAQAEYMRMOLLQYGKFTVXSTNSEAQYTYDYNMDAKQQYAVTKKVn^ 181 

V+G A+ E MRMQ+L GK S YD K+Q V+K W P + P+A 

Sbjct: 124 VNGARARLEAMRMQVLATGKIAFTSDGVNKDIDYGVKPDHKKQ--VSKSWAEPG-ATPLA 180 

Query: 182 DX LAAMDDI ENRTG VTt PTRMVLNRNTYNQMTKSDS I KKAL - AIGVQGS WENFLLLASDAE 240 

D+ A+ + G+ P R V+M T> + K+ S K + + GS ♦ ++ E 

Sbjct: 181 DLEDAI - ETARELGLNPERAVKNAXTFGLZRKAASTV1CVIKPLAGOGS AVTKAELE 235 

Query: 241 KFIAEKTGWIAWSKKIAQFADADKLPDVGNIRQFNLIDDGKVVLLPPDAVGHTWYGTT 3 00 

+IA+ G+ X ♦ ♦ DG++F DG + L+P +G+T +GTT 

Sbjct: 236 NYIADNFGVSIVLENGTYRN DKGEVSKF- - YPDGHLTLIPNGPLGNTVFGTT 28S 

Query i 3 01 PEAFDLASGGT - DAQVQVLSGGPTVTTYLEXH PVNI ATWSAVMI PSFEGI DYVGVLT 357 

PE DL «• T +A+V+++ G VTT PVN+ T VS V +PSFE +D V +LT 

Sbjct: 286 PEESDLFADNTVNAEVEIVDKGIAVTTTKTTDPVNVQTKVSMVALPSFER1J)DVYMLT 343 



Query= sid|114834|lan|dp1ORF013 Phage dp1 ORF|10215-11240|3
(341 letters)

>sp|P09122|DP3X_BACSU DNA POLYMERASE III SUBUNITS GAMMA AND TAU
Length = 563

Score = 182 bits (458), Expect = 2e-45

Identities = 118/353 (33%), Positives = 176/353 (49%), Gaps = 31/353 (8%)

Query: 7 YRPQTFEEVVAQEYVKEILLNQLQNGAIKHGYLFCXXXXXXXXXXXRIFAKDVN 60 

+RPQ FE+W QE++ + L N L H YLF +IFAK VM 

Sbjct: 10 FRPQRFEDVVGQEHITKTLQNALLQIOXFSHAYLFSGPRGTGKTSAAKI FAKAVNCEHAPV 69 

Query: 61 KGL GSPIEIDAASNMGVEMVRN1IBDSRYKSMDSBFKVY1ID8VH 105 

KG+ IEIDAASNNGV+ +R+I + ++ +KVYI IDEVH 

Sbjct: 70 DBPCMECAACKG1TNGSISDVIEIDAASNNGVDE2RDIRD1CVKFAPSAVTYKVYIIDEVH 129 

Query: 106 Ml^TGAFNALIJCTLEEPSSGTVFILCnTDPQKIPDTILSRV 165 

MLS GAFNALLXTLEEP +FIL TT+P KIP TI+SR QRPDF RI ♦ IV ++ 
SbjCt: 130 MLS IGAFHALLKTLEEPPEHCI FI LATTEPHKI PLTI I SRCQRFDFKRI TSQAI VGRMNK 189 

Query: 166 1 1 E S ENEEGAG YS YERDALSF I G K1ANGGMRDSITRLEKVLDYSHHVDMEAVSNAL G 222 

I+++E E +L I A+GGMRD+++ L++ + +S D+ V +AL G 

Sbjct: 190 IVDAEQ LQV^GSLBIIASAAHGGMRDALS1XDQAISFSG--DILKVEI)AI«LITG 242 

Query: 223 VPDYETFASLVEAIANYIJGSKCLEI VNDFHYSGKDLKLVTRNFTDFLLE^ S 282 

L + + S LB +N+ GKD + + + ++ Y + 

SbjCt: 243 A VSQLY I GKLAKS LHDKNVS D ALETLNELLQQGKDP AKLI EDMI FYFRDMLLYXTAPGLE 302 

Query: 283 ITQLPAHFESKI^QFCEAFQYPTLLWMLEBMMEIAGWKHBPNAKPIIETKLL 335 

+ + E L M++ +N+ +KW ♦ + B 

Sbjct: 303 GVLEKVKVDETFREI^EQIPAQALYEMIDILNKSHQEMKWTNHFRXFFEVAVV 3S5 



Query= sid|114835|lan|dp1ORF014 Phage dp1 ORF|50961-51974|3
(337 letters)

>sp|P47492|PRIM_MYCGE DNA PRIMASE >gi|1361496|pir||F64227 DNA primase (dnaE) homolog
MG250 - Mycoplasma genitalium (SGC3) >gi|3844846
(U39704) DNA primase (dnaE) [Mycoplasma genitalium]
Length = 607

Score = 57.0 bits (135), Expect = 2e-07

Identities = 53/190 (27%), Positives = 89/190 (45%), Gaps = 17/190 (8%)

Query: 146 EELDKYRFIHP YMYERKLTDBLIEMFDVGYDK- - LHDCITPPVRNLKGETVFF 196 

E +++Y FI + P Y++K ++FD K +IP++GVF 

SbjCt: 170 ESMERYPFINPKIKPSELYLFS - KTNQQGLGFFDFNTKKATFQNQIMIPIHDFNGNPVGF 228 

Query: 197 NRRS VRS KFHQYG EDD P KT EFL YGQYELVAFRD YFEKP I SQVFVTESVINCLTLHSMKI P 256 

+ RSV ♦ + + EF ♦ ♦ EL+ K ++Q+F+ E + TL + K _ 

SbjCt: 229 SARSVDNINKLKYKNSADKEF - FKKGELLFNFHRLNKNLNQLFI VBGYFDVFTLTNSKFE 287 

Query: 257 AVAI*ffiVGGGN-QINLLKR--LPYRNIVLALDPDNAGQTAQBKLW^ 312 

AVALMG+ + QI +K ♦ +VLALD D +GQ A L +L ♦ +V + + 

Sbjct: 288 AVALMGIJu^NDVQ IKAI KAKFKELQTLVIALDNDASGQNAVPSLI EKLNNNNFI VB IVQW 347 
Query: 313 PKEFYDNKWD 322

* D WD 
Sbjct: 348 EHNYKD--WD 355



Query= sid|114837|lan|dp1ORF016 Phage dp1 ORF|43413-44303|3
(296 letters)

>emb|CAB07986| (Z93946) N-acetylmuramoyl-L-alanine amidase [bacteriophage Dp-1]
Length = 296

Score = 661 bits (1686), Expect = 0.0

Identities = 296/296 (100%), Positives = 296/296 (100%)

Query: 1 MGVDIEKGVAWMQARKGRVSYSMDFRDGPDSYDCSSSWYYALRSAGASSAGWAVNTEYMH 60

MGVDIEKGVAWMQARKGRVSYSMDFRDGPDSYDCSSSWYYALRSAGASSAGWAVNTEYMH
Sbjct: 1 MGVDIEKGVAWMQARKGRVSYSMDFRDGPDSYDCSSSWYYALRSAGASSAGWAVNTEYMH 60

Query: 61 AWLIENGYELISENAPWDAKRGDIFIWGRKGASAGAGGHTGMFIDSDNIIHCNYAYDGIS 120

AWLIENGYELISENAPWDAKRGDIFIWGRKGASAGAGGHTGMFIDSDNIIHCNYAYDGIS
Sbjct: 61 AWLIENGYELISENAPWDAKRGDIFIWGRKGASAGAGGHTGMFIDSDNIIHCNYAYDGIS 120

Query: 121 VNDHDERWYYAGQPYYYVYRLTNANAQPAEKKLGWQKDATGFWYARANGTYPKDEFEYIE 180

VNDHDERWYYAGQPYYYVYRLTNANAQPAEKKLGWQKDATGFWYARANGTYPKDEFEYIE
Sbjct: 121 VNDHDERWYYAGQPYYYVYRLTNANAQPAEKKLGWQKDATGFWYARANGTYPKDEFEYIE 180

Query: 181 ENXCTFYFDDQGYMLAEKWLOTTDGNWYWPDRDGYMATSW1CT 240 

ENKSWFYFDDC^YMLAEKHlJOrrDGllWYWFDRDGYMATSWKRIGESW^ 
Sbjct: 1B1 ENKSWFYFDDQGYWIABKWIJaiTDGNWYWFDRDGYMATSWKRIGBSWYYF 240 

Query: 241 I KYYDNWYYCDATNGDMKSNAFIRYNDGWYLIAPDGRLADKPQFTVTIPDGLI^ 296 

I KYYDMWYYCDATNGDMKSNAFIRYNDGWYLIJ«PDGRLADKPQFTVBPDGLITA1CV 
Sbjct: 241 IKYYDNWYYCDATNGDMICSNAFIRYNDGWYLLLPDGR1JU3KPQFTVBPDGLITAKV 296 

Query= sid|114841|lan|dp1ORF020 Phage dp1 ORF|1864-2658|1
(264 letters)

>emb|CAB13247| (Z99111) similar to coenzyme PQQ synthesis [Bacillus subtilis]
Length = 243

Score = 217 bits (548), Expect = 5e-56

Identities = 117/248 (47%), Positives = 163/248 (65%), Gaps = 15/248 (6%)

Query: 23 MPIMBI FGPTIQGBGMVIGQKTIFIRTGGCDYHCNWCDSAFTWNGTTEPB - - YITGKEAA 80 

+P++EIFGPTIQGEGMVTGQKT+F+RT GCDY C+WCDSAFTW+G+ + + ++T +E 
Sbjct: 5 IPVLEIFGPTIQGEGMVIGQKrTMFVRTAGCDYSCSWCDSAFTWDGSAKKDIRW>fTAEEIF 64 

Query: 81 S RI LKLAFNDKGEQI CNHVTLTGGNPAL INEPMAKMI S I LKEHGPKFGLETQGTRFQEWF 140 

♦ t- D G ♦HVT++GGNPAL+ + + I +LKE+ + LETQGT *Q+WF 

Sbjct: 65 ABL K3JIGGDAFSHVTISGGNPALLKQ-LDAFIELUCENNIRAALETQGTVYQDWF 118 

Query: 141 KEVSDITTSPKPPSSGMRTNMKILEAIVDRM- -NDENLDWSPKIVIPDENDLAYARDMFK 198 

+ D+TISPKPPSS MTN+L+I++ND S K+VIF++ DL +A+ + K 

Sbjct: 119 TXIDDLTISPKPPSSKMVTNFQKLDHILTSLQENDRQHAVSLIOAriFNDEOLBFAlCTVHIC 178 

Query: 199 TFEGKLRFVNYLSVGNANAY - - EEGKI SDRLLEKLGWLWDKVYEDPAFNNVRPLPQLHTL 256 

+ G YL VGN + + +♦ + LL K L DKV D N VR LPQLHTL 

Sbjct: 179 RYPG 1 pr/UJVGOTDVHTTDDQSLIAHLUJKYEALVDKVAVDAEI^LVRVLPQLHTL 23S 



Query: 257 VYDNKRGV 264

++ NKRGV
Sbjct: 236 LWGNKRGV 243
Query= sid|114842|lan|dp1ORF021 Phage dp1 ORF|2504-3295|2
(263 letters)

>sp|P19465|GCH1_BACSU GTP CYCLOHYDROLASE I (GTP-CH-I) >gi|98411|pir||A38256 GTP
cyclohydrolase I (EC 3.5.4.16) - Bacillus subtilis
>gi|143231 (M37320) regulatory protein [Bacillus
subtilis] >gi|143799 (M80245) MtrA [Bacillus subtilis]
>gi|2634696|emb|CAB14194| (Z99115) GTP cyclohydrolase I
[Bacillus subtilis]
Length = 190

Score = 208 bits (523), Expect = 4e-53

Identities = 103/185 (55%), Positives = 133/185 (71%), Gaps = 1/185 (0%)

Query: 80 VTLDNTEAAVQRLFGLIjGEDAEIUXJLQDTPFftFVKA 139 

V + B +GED R+GL DTP R K AE G EDPK H + F +H 

ShjCt: 4 WKEQIEQATOQZLEAIGEDPNREGIJ^>TPXRVAX>fYAF 63 

Query: 140 SDLVLVKDIPFNSLCE^IAPFVGKVHIAYIPKD-KITGLSKPGRVVEGYAXRLQVQERL 198 

E+LVLVKDI F+S+CEHHL PF GK H+AYIP+ K+TGLSK R VE AKR Q+QER+ 
Sb jet : 64 BELVLVXDZAFHSMCEHHLVPFYGKAHVAYI PRGGKVTGLSKLARAVEAVAIG?PQLQERI 123 

Query: 199 TTOIADAIQBVUfPQAVAVIVEAEHTCMSGRGIX^ 258 

T IA++I B L+P V V+VEAEH CM+ RG++K GA TVTS +RG+F+DOA+ARAE+L 
Sbjct: 124 TSTIAESIVBTLDPHGVWWVEAEHMCmTOGVRKPGAKTV^ 183 

Query: 259 QLIXX 263 
+ IK* 

Sbjct: 184 EHIKR 188 



Query= sid|114843|lan|dp1ORF022 Phage dp1 ORF|30896-31675|2
(259 letters)

>gi|2347102 (U77367) internalin [Listeria monocytogenes]
Length = 821

Score = 55.0 bits (130), Expect = 5e-07

Identities = 44/149 (29%), Positives = 63/149 (41%), Gaps = 13/149 (8%)

Query: 119 FRMNIYVFNYVG-OSIVNYVKITI^CTGXAPGLSIGKBFY 176 

F + VPN ♦ D + + HNTAPL YPB+K ♦ K ♦ 

SbjCt i 383 F S KTLSVPNN ITS I DCTLI APBTI SNNGTYDAPNL3CWSL PNYLP B - - VKYTFSQKI PIGT 440 

Query: 177 KSKDYVAQLPAVLR RVTFDLNGGTGTADAVRVEAGKKI S PKPVDPTLTGKAFKGW 231 

♦ +Y + L+ +VTF++ GT + V E +P+PPTGPGW 
Sbjct: 441 GTSKYSGPITQPLKBl^YKVTFNVBGNTSBVETVTEE NLIPEPTSPTKQGYTPDGW 497 

Query: 232 -KVEGBSTIWDFDNHMMPDRDVKLVAQPA 259 

E T WDF MP D+ L A F+ 
Sbjct: 498 YOABTGGTKTOPTTGQMPANDLTLYAHFS 526 



Score = 43.4 bits (100), Expect = 0.002

Identities = 47/195 (24%), Positives = 73/195 (37%), Gaps = 12/195 (6%)

Query: 72 YDLTFKDNTFDPEIMALIEGGTVRQQGGTIAGYDT- PMLAQGASNWKPFRMNIYVPNY- - 128 

YD + T ♦ "+G * GG + T MA + F+NYM+ 
Sbjct: 547 YDALIiraPTTTTlCQGYTFDGWYDABTaWKMDFKTMKM 606 

Query: 129 ---VGDSIVNYVKITLrmCTGKAFGWIGKEFYAPE 185 

V + ♦ Y + T G+ ♦ A K TK+P + A 

Sbjct: 607 DGBVKNBTIAYDTLLNBmPTKCGYTFDGWTO 665 

Query: 186 P AVLRR VT FD LNGGTGT AD A VR VEAG KX I S P K P VD PT LTG KA F KG W - KVEGESTIWDFDN 244 

♦ FD++G T + V +A ♦ P+P P+ TG +GW B T WDF 
Sbjct: 666 TINNYQANFDIDGAV-TEEWNYDA LIPEPTSPSKTGFTLEGV/YDAEVGGTKWDFKT 721 

Query: 245 HMMFDRDVKLVAQPA 259 

MP D+ L A P+ 
Sbjct: 722 MKMFAND ITLYAHFS 736 



Score = 38.3 bits (87), Expect = 0.057

Identities = 42/169 (24%), Positives = 59/169 (34%), Gaps = 10/169 (5%)
Query: 96 QCGOTIAGYTn"- PMLAQGASNMKPFRMNIYVPfTYVGDSIVNYVKIT LNNCTGKAPG 150

+■ GOT + T MA + F+NYN+D+V + LNT 
Sbjct: SOX ETGGTKWDFTTGQMPANDLTLYAKFSVNSYQA^FDIDGVVTN^ S60 

Query: 151 I^IGREFYAPEFNIXAR£ATKAGLPWSMDYVAQLPAVIJUIVTFOLNGGTGTAOAVRVCA 210 

♦YE + +P + + A «• FO++G A 
Sbjct: 561 GYTETOWTOAETGGNICWDFKTMKMPANDVAFYAHFTIKNYQANFT>iDGEVXNETI A 616 

Query: 211 GKKISPKPVDPTLTGKAFKGW-KVEGSSTIWDFDNHMMPDRDVKLVAQF 2S8 

+ + P PT G F GW E T WDF MP DV L A P 
SbjCt: 617 YDTI^EPTTPTKQGYTFDGWYDAETGGTKWDFKTKEMPAMOVTLYAHF S65 



Query= sid|114850|lan|dp1ORF029 Phage dp1 ORF|662-1348|2
(228 letters)

>gi|2650185 (AE001074) succinoglycan biosynthesis regulator (exsB)
[Archaeoglobus fulgidus]
Length = 239

Score = 119 bits (295), Expect = 2e-26

Identities = 79/224 (35%), Positives = 113/224 (50%), Gaps = 11/224 (4%)

Query: 1 MCSVVLLSGGVDSATCIJUEVDKWGSKNVHAIAFOT^ 60 

MK+V+LLSGG+DS+T L +D G VHA+ F YGQKH E+E+A VA V+ 
Sbjct: 1 MKAVMLLSGGXDSSTLLYYIiLD- -GGYEVHALTFPYGQKHSKEIESAEXVAXAAKVRHLK 58 

Query: 61 LEIDSKIYXXXXXXLIiQGKGEISHGKSYAEIU^KEVVDTYVPFRN^ 120 

**I S X+ L Gt- 6+ Y+E ♦ ♦ T VP RH ++LS 
Sbjct: 59 VDI - STIHDLISYGALTGEEEVPKA- FYSSEVQRR TI VPNRNMXLLS - - IAAGYAV 110 

Query: 121 XXXXXXXXXXXXXXXXXXXPIXrrPBFYWSMSNAKEYGT -GGKVTL.VAPLLTLTXAQWKW 179 

PDC EF ++ A+ V + AP + +TKA +V+ 

Sbjct: 111 KIGAKEvTfYAAHLSDYSIYPDCRKEFVTCALDTAVYLANIVfrpVEv^ 170 

Query: 180 GIDLDVPYFLTRSCYESOABSCGTCATCIDRXKAFEENGKTDPI 223 

G+ L VPY LT SCYE C +C TC++R +AF WG+ DP* 

Sbjct: 171 GUOGVPYSLTWSCYEGGERPCLSCOTCLERTEAPIANGVTCDPL, 214 

Query= sid|114855|lan|dp1ORF034 Phage dp1 ORF|131-652|2
(173 letters)

>emb|CAB13245| (Z99111) similar to hypothetical proteins [Bacillus subtilis]
Length = 165

Score = 220 bits (556), Expect = 4e-57

Identities = 103/139 (74%), Positives = 117/139 (84%)

Query: 5 TTRTOABLTGVT1JjGNQOTKYDYDYNPDv1*ETFPNXHPE 64 

TTR ++RL GVTLLGNQ T Y ++Y PDVLE+FPNKH +Y V P+ EFTSLCPKTGQ 
Sbjct: 2 TTRKESELEGVTLLGUJQGTNYLFEYAPDVXESFPNKHVVR£YF 61 

Query: 65 PDPANWXSYXPNBKMVESXSIJCLYI^SFRNKGOFHEDCMNXII^LYEUtEPKYlBVMG 124 

PUFA + + ISYIP+BKMVESKSLXLYLFSFRNHGDFHEDCMNII+NDL ELM+P+YIBV G 
Sbjct: 62 PDPAT1YI S^I PDEKMV^SKSlJCLYliFSFWWGDFHEIX^WI IMNDLIEl^TOPRYIBVWG 121 

Query: 125 L FTPRGG I S I YP FVNKVNP 143 

FTPRGGXSI P* N P 
Sbjct: 122 KPTFRGGXSIDFYTNYGKP 140 

Query= sid|114857|lan|dp1ORF036 Phage dp1 ORF|48808-49362|1
(184 letters)

>gi|1353529 (U38906) ORF12 [Bacteriophage r1t]
Length = 296

Score = 53.5 bits (126), Expect = 1e-06

Identities = 42/149 (28%), Positives = 70/149 (46%), Gaps = 9/149 (6%)

Query: 34 IASKTVGNGKTSWAVllLLQRYIAETAIiDGRXV^KGMFWSAQLLTBFGDYNYP^TMQEFL 93 

+ S G GK+ A+ +L+ LTL ++ V ♦ F ♦ ♦ F ♦ + F+ 

Sbjct i 155 VVSGPAGTGKSHLAMSILKDCLQHTDLT- -VTFASWSEVLHLIKDSFDNKDSFYSTEYFM 212 
Query: 94 ERFBRLKTCELI/VIDEICGGSIjTKASVPVLYDLVNYRVDNNLSTIYTrNYTDDEIIDliLiG 153

E F * ♦LLVTD+IG *T* S L R TI TTN DEI 

Sbjct; 213 EVF RNTDLLVIDDIGSEKITEWSMSLLTEVLDART KTIITTNLKSDEIRXKYH 26S 

Query: 154 QRLYSRIYDTSVVLDFQASIJVRGLEVSEI 182

R YSR++ F N++ VS+* 

Sbjct: 266 NRTYSRLFRGIGK3CAFNFENIKDKRVSQL 294 

Query= sid|114859|lan|dp1ORF038 Phage dp1 ORF|1350-1871|3
(173 letters)

>sp|P44123|YB90_HAEIN HYPOTHETICAL PROTEIN HI1190 >gi|1074675|pir||F64021 hypothetical
protein HI1190 - Haemophilus influenzae (strain Rd KW20)
>gi|1574117 (U32798) 6-pyruvoyl tetrahydrobiopterin
synthase, putative [Haemophilus influenzae Rd]
Length = 141

Score = 100 bits (247), Expect = 6e-21

Identities = 59/143 (41%), Positives = 83/143 (57%), Gaps = 10/143 (6%)

Query: 2 RVSKTLTroAAHQLVGHFOKCANLHGHTYXVEISIA 60 

♦+SK +FD AH L GH GKC NLHGHTYK+++ ++G Y G* + MV+DF +K I 
SbjCt: 3 KI SKE FS FDMAHLLDG HDGKCQNLHGHT YKLQVE I SGD I> YXSGAKKAMVI D FSDLXS I VK 62 

Query: 61 GTFIDRLDHAVLL-QGNBP lALAMAVDTKRVLFGFRTTAENMSRFLTWTLTEXMWX 115 

4D +DHA ♦ Q MS L *++K FRTTAR ++RP+ L + 

Sbjct: 63 KVILDPMDHAFIYDQn^ESQIATLLQKLNSKT^^ 120 

Query: 116 HARIDSIKLWETPTGCAECTYYE 138

I SI+LWETPT + C Y E
Sbjct: 121 QLSISSIRLWETPT--SFCEYQE 141

Query= sid|114860|lan|dp1ORF039 Phage dp1 ORF|3306-3803|3
(165 letters)

>emb|CAA68244| (X99978) ORF7; hydophobic protein [Lactobacillus plantarum]
Length = 166

Score = 64.4 bits (154), Expect = 5e-10

Identities = 49/156 (31%), Positives = 84/156 (53%), Gaps = 9/156 (5%)

Query: 8 WLVRTALIAALYVT LTVAFSA I S Y - - G P I QFRVSEALI LL PLWNHRWT PO IVLGTI I ANP 65 

»+♦ AL+AA+YV L + +A S G IQFRVSE L L **H GIV G U + 
SbjCt: 9 WIIN-ALVAAMYVVLCMPAAFSIASGAIQFWSEGLKHLAVFN^ 67 

Query: 66 FSP-LGLIDVLFGSLATFLCXXXXXXXXXXXSPLYSLICPVLA NAYL I ALELRI VY 120 

F P L++VLFG + L ++ + +A +• ++IAL + +♦ 

Sbjct: 68 FGPGASIJiNVX«FGGGQSLLALLVLTWLAPKLKTVWQRMLLNIALFW 127 

Query: 121 S-LPFWBSVIYVGISEAIIVLISYFLISTLAKNNHF 155

S ♦ FW + + +SE 11+ 1+ ++ +L + HF 
Sbjct: 128 SXJVAFWPTYLTTALSELIIMSITAFIMYSLDRVLHF 163 

Query= sid|114862|lan|dp1ORF041 Phage dp1 ORF|8208-8699|3
(163 letters)

>gi|2522313 (AF012906) dUTPase homolog [Bacillus subtilis]
>gi|2634394|emb|CAB13893| (Z99114) similar to
deoxyuridine 5'-triphosphate nucleotidohydrolase
[Bacillus subtilis] >gi|3025643 (AF020713) putative
dUTPase [Bacteriophage SPBc2]
Length = 142

Score = 108 bits (267), Expect = 2e-23

Identities = 65/160 (40%), Positives = 83/160 (51%), Gaps = 25/160 (15%)

Query; 5 VI!VKMIDPm)RUm- -GDVT^VRISSITKIDWJSADVSRCRKVLQKAQVYSVAAGECI 62 

♦ +K. +D R+ GDW+D+R ♦ ID* - 
SbjCt: 3 IKIKYl^ETQTRINKMEQGDWIDLRAAEDVAIKKDEFKL 41 - 

Query: 63 KIAHG PALELFtGGYEAI LHPRSSLFKKTGLI FVSS - GVI D EG YKGDTDEWFS VWYATRDft. 121 

+ G A+ELP+GYEA + PRSS +K G+I +S GVIDE YKGD D WF YA R0 
Sbjct: 42 -VPLGVAMELPEGYEAHVVPR5STTKNFGVIQTNSMGVIDESYKCTHDFWFPAYAIiRDT 100 
Query: 122 DIFYDQR1AQFRIQEKQPAIKFTJFVESLGNAARGGHGSTG 161

I RI OFRI +K PA+ V+ LGN RGGHGSTO 
Sbjct: 101 KIKKGDRICQFRIMKKMPAVDLIEVDRLGNGDRGGKGSTG 140 

Query= sid|114867|lan|dp1ORF046 Phage dp1 ORF|42774-43202|3
(142 letters)

>emb|CAB07984| (Z93946) hypothetical protein [bacteriophage Dp-1]
Length = 142

Score = 287 bits (728), Expect = 2e-77

Identities = 142/142 (100%), Positives = 142/142 (100%)

Query: 1 MFMWLHDTAVLTTIITACSCr/LTVI^KLFEW^ 60 

MPMWLMDTAVLTT I ITACS0VLTVLLNKLFEWKSNKAK5VLEDISTTLSTLXQQVDGIDQ 
Sbjct: 1 MPNWL2JDTAVLTTI ZTACSGVLTVLLNKLFEHKSNKAXSVLED I STTLSTLKQQVDG IDQ 60 

Query: 61 TTVAXlffiQNDVIQDGTRKZQRYRLVlQLKREVITGYrTIiDHFREItSILFSSTKNI^GNGE 120 

TTVAINHQNDVIQDOTRXIQRVRLYHDLKRBVITGYTTL^^ 
Sbjct: 61 TTVAINHQNDVIQDGTRKIQRYRLYHDLKREVITGYTTIJDH^ 120 

Query: 121 VEALYEKYKKLPIREEDLDETI 142

VEALYEKYKKLPIREEDLDETI 
Sbjct: 121 VEALYEKYKKLPIREEDLDETI 142 

Query= sid|114901|lan|dp1ORF080 Phage dp1 ORF|42490-42759|1
(89 letters)

>emb|CAB07983| (Z93946) hypothetical protein [bacteriophage Dp-1]
Length = 124

Score = 147 bits (367), Expect = 1e-35
Identities = 75/75 (100%), Positives = 75/75 (100%)

Query: 1 MLNLTKSRQ I VAEFT IGQGAEKKLVKTT I VN I DANAVSTVSETLHD PD LYAAKRRELRAD 60 

MLNLTKSRQI VAEFTIGQGABKKLVKTTIVll IDAKAVSTVSETLHDFDLYAANRRELRAD 
Sbjct: 1 KLNLTKSHQZVAEFTIGQGAEKKLVKTTIVNICANAVSTVSETL^ 60 

Query: 61 EQKLRETRYAIKDEI 75 

EQKLRETR YAI BDBI 
Sbjct: 61 EQKLRBTRYAIEDEI 75 

Query= sid|114912|lan|dp1ORF091 Phage dp1 ORF|43189-43413|1
(74 letters)

>emb|CAB07985| (Z93946) holin [bacteriophage Dp-1]
Length = 74

Score = 63.2 bits (151), Expect = 2e-10
Identities = 34/74 (45%), Positives = 34/74 (45%)

Query. 1 JJ££^£YDX> * * * «XXXXXXXXXXXXXXXXYQ^ VLGVSSR 

Sbjct: 1 MKLSIJEQYDVAitNVVTVVVPAAIALITGLGALYQPIT^ 60 

Query: 61 NYQKEQEAQNNEVE 74 

NYQKEQEAQNNEVE
Sbjct: 61 NYQKEQEAQNNEVE 74
Condensed listing of homology information from above



Phage: dpi 
Database: nr 
Program: Blastp 

Query= sid|114822|lan|dp1ORF001 Phage dp1 ORF|36698-40390|2
(1230 letters)

gi|2444124 (U88974) ORF45 [Streptococcus thermophilus temperate ... 467 e-130
gi|928828 (L44593) ORF1904; putative [Lactococcus lactis phage B... 427 e-118
gi|2935676 (AF032121) unknown [Streptococcus thermophilus bacter... 309 1e-82
gi|2935691 (AF032122) unknown [Streptococcus thermophilus bacter... 306 7e-82
gi|3540289 (AF057033) putative anti-receptor [Streptococcus ther... 279 6e-74
gi|4530154|gb|AAD21894.1| (AF085222) putative tail-host specific... 220 3e-56
gi|930045|emb|CAA33387| (X15332) alpha-1 (III) collagen [Homo sa... 58 4e-07
gi|1070603|pir||CGHU7L collagen alpha 1(III) chain precursor - h... 58 4e-07
gi|4502951|ref|NP_000081.1|PCOL3A1| collagen, type III, alpha 1 ... 58 4e-07
gi|115290|sp|P04258|CA13_BOVIN COLLAGEN ALPHA 1(III) CHAIN >gi|7... 58 4e-07
gi|575322|emb|CAA36279| (X52046) type III collagen [Mus musculus] 57 8e-07
gi|2119163|pir||S59856 collagen alpha 1(III) chain precursor - ra... 57 8e-07
gi|543912|sp|P13941|CA13_RAT COLLAGEN ALPHA 1(III) CHAIN >gi|543... 57 1e-06
gi|3171998|emb|CAA06510| (AJ005395) collagen alpha 1 (III) [Ratt... 57 1e-06
gi|3947565|emb|CAA50250| (Z49967) similar to collagen; cDNA EST ... 54 7e-06
gi|423403|pir||A46053 bullous pemphigoid antigen, BPAG2, type XV... 53 9e-06
gi|115410|sp|P12114|CCS1_CAEEL CUTICLE COLLAGEN SQT-1 >gi|84437|... 53 9e-06
gi|38?3801|emb|CAA90084| (Z49907) cuticle collagen SQT-1; cDNA E... 53 9e-06

Query= sid|114823|lan|dp1ORF002 Phage dp1 ORF|32386-35835|1
(1149 letters)

gi|3341922|dbj|BAA31888| (AB009866) orf 15 [bacteriophage phi PVL] 280 3e-74
gi|4126622|dbj|BAA36642.1| (AB016282) ORF36 [bacteriophage phi-105] 232 1e-59
gi|1369948|emb|CAA59194| (X84706) host interacting protein [Bact... 201 3e-50
gi|3139112 (AF063097) gpT [Bacteriophage P2] 188 2e-46
gi|3337272 (U32222) G protein [Bacteriophage 186] 161 3e-38
gi|4063799|dbj|BAA36253| (AB008550) orf25; similar to T gene of ... 159 8e-38
gi|3172274 (AF022214) minor tail subunit; putative tape-measure ... 123 6e-27
gi|465127|sp|Q05233|VG26_BPML5 MINOR TAIL PROTEIN GP26 >gi|41904... 108 2e-22
gi|3540284 (AF057033) putative minor tail protein [Streptococcus... 99 2e-19
gi|2444119 (U88974) ORF40 [Streptococcus thermophilus temperate ... 90 6e-17
gi|2634555|emb|CAB14053| (Z99115) yomI [Bacillus subtilis] >gi|3... 66 1e-09
gi|2392838 (AF011378) unknown [Bacteriophage sk1] 64 5e-09
gi|2764873|emb|CAA66557| (X97918) gene 18.1 [Bacteriophage SPP1] 62 3e-08
gi|1353559 (U38906) ORF42 [Bacteriophage r1t] 61 6e-08
gi|630841|pir||S39079 puff C-8 protein - fungus gnat (Rhynchosci... 55 2e-06
gi|1730865|sp|P51731|YO27_BPHP1 HYPOTHETICAL 72.8 KD PROTEIN IN ... 53 8e-06
gi|224288|prf||1101273J ORF 7 [Bacteriophage HP1] 53 1e-05

Query= sid|114824|lan|dp1ORF003 Phage dp1 ORF|53538-55877|3
(779 letters)

gi|118825|sp|P00582|DPO1_ECOLI DNA POLYMERASE I (POL I) >gi|6705... 193 3e-48
gi|2982102|pdb|1KFS|A Chain A, All-Oxygen Dna Complexed To The 3... 193 3e-48
gi|229889|pdb|1DPI| DNA Polymerase I (Klenow Fragment) (E.C.2 193 3e-48
gi|1169402|sp|P43741|DPO1_HAEIN DNA POLYMERASE I (POL I) >gi|107... 191 1e-47
gi|2688462 (AE001156) DNA polymerase I (polA) [Borrelia burgdorf... 190 3e-47
gi|809180|pdb|1KLN|A Escherichia coli 190 3e-47
gi|1913934|emb|CAA72997| (Y12328) DNA-directed DNA polymerase I ... 189 8e-47
gi|4090935 (AF028719) DNA polymerase type I [Rhodothermus sp. *i... 175 1e-42
gi|4731571|gb|AAD28505.1|AF121780_1 (AF121780) DNA polymerase I ... 174 2e-42
gi|1633576 (U57757) similar to proofreading 3'-5' exonuclease an... 173 4e-42
gi|3322368 (AE001195) DNA polymerase I (polA) [Treponema pallidum] 172 9e-42
gi|1006595|dbj|BAA10748| (D64005) DNA polymerase I [Synechocysti... 171 2e-41
gi|585062|sp|Q07700|DPO1_MYCTU DNA POLYMERASE I (POL I) >gi|4161... 163 5e-39
gi|4376908|gb|AAD18751| (AE001645) DNA Polymerase I [Chlamydia p... 157 2e-37
gi|1169403|sp|P46835|DPO1_MYCLE DNA POLYMERASE I (POL I) >gi|107... 152 7e-36
gi|2145839|pir||S72949 DNA polymerase I - Mycobacterium leprae >... 152 7e-36
gi|1405438|emb|CAA67184| (X98575) DNA-dependent DNA polymerase (... 152 9e-36
gi|2506365|sp|P80194|DPO1_THECA DNA POLYMERASE I, THERMOSTABLE (... 147 2e-34
gi|3328929 (AE001322) DNA Polymerase I [Chlamydia trachomatis] 147 3e-34
gi|3913510|sp|O52225|DPO1_THEFI DNA POLYMERASE I, THERMOSTABLE (...
gi|1205984 (U33536) dna polymerase I [Bacillus stearothermophilus]
gi|118827|sp|P13252|DPO1_STRPN DNA POLYMERASE I (POL I) >gi|9802...
gi|1942202|pdb|1JXE| Stoffel Fragment Of Taq Dna Polymerase I
gi|1943520|pdb|1KTQ| Dna Polymerase
gi|1084022|pir||JX0359 DNA-directed DNA polymerase (EC 2.7.7.7) ...
gi|507891|dbj|BAA06775| (D32013) DNA Polymerase [Thermus aquaticus]
gi|118828|sp|P19821|DPO1_THEAQ DNA POLYMERASE I, THERMOSTABLE (T...
gi|1706502|sp|P52028|DPO1_THETH DNA POLYMERASE I, THERMOSTABLE (...
gi|1097211|prf||2113329A DNA polymerase [Thermus aquaticus therm...
gi|2098289|pdb|1TAU|A Chain A, Structure Of Dna Polymerase

Query= sid|114825|lan|dp1ORF004 Phage dp1 ORF|40401-42440|3
(679 letters)

gi|1934761|emb|CAB07981| (Z93946) hypothetical protein [bacterio.
gi|3540290 (AF057033) putative minor structural protein [Strepto.
gi|2444125 (U88974) ORF46 [Streptococcus thermophilus temperate .
gi|1934762|emb|CAB07982| (Z93946) hypothetical protein [bacterio.
gi|4530155|gb|AAD21895.1| (AF085222) unknown [Streptococcus ther.
gi|2935677 (AF032121) unknown [Streptococcus thermophilus bacter.
gi|2935692 (AF032122) unknown [Streptococcus thermophilus bacter.
gi|1136289 (U42597) histidine kinase A [Dictyostelium discoideum]

Query= sid|114827|lan|dp1ORF006 Phage dp1 ORF|45296-46987|2
(563 letters)

gi|4377165|gb|AAD18987| (AE001666) SWI/SNF family helicase_2 [Ch.
gi|1769947|emb|CAA67095| (X98455) SNF [Bacillus cereus]
gi|3329163 (AE001341) SWF/SNF family helicase [Chlamydia trachom.
gi|4377149|gb|AAD18973| (AE001664) SWI/SNF family helicase_1 [Ch.
gi|3328995 (AE001326) SWI/SNF family helicase [Chlamydia trachom.
gi|2493354|sp|P75093|Y018_MYCPN HYPOTHETICAL HELICASE MG018/MG01.
gi|1653748|dbj|BAA18659| (D90916) helicase of the snf2/rad54 fam.
gi|1763712|emb|CAB05939| (Z83337) member of the SNF2 helicase fa.
gi|2636153|emb|CAB15645.1| (Z99122) similar to SNF2 helicase [Ba.
gi|2909552|emb|CAAn284| (AL021924) helZ [Mycobacterium tubercul.
gi|3844627 (U39681) ATP-dependent RNA helicase, putative [Mycopl..
gi|1351463|sp|P47264|Y018_MYCGE HYPOTHETICAL HELICASE MG018
gi|2660669 (AC002342) human Mi-2 autoantigen-like protein [Arabi.,
gi|1361537|pir||I64201 helicase (mot1) homolog - Mycoplasma geni.
gi|3482977|emb|CAA20533.1| (AL031369) putative protein [Arabidop.,
gi|3298562 (U91543) zinc-finger helicase [Homo sapiens]
gi|3875971|emb|CAB02491| (Z80344) similar to helicase; cDNA EST
gi|4557451|ref|NP_001263.1|PCHD3| chromodomain helicase DNA bind,
gi|2643435 (AF007780) CHD3 [Drosophila melanogaster]
gi|3875165|emb|CAA91798| (Z67881) Similarity to Mouse Chromodoma.

Query= sid|114828|lan|dp1ORF007 Phage dp1 ORF|22230-23621|3
(463 letters)

gi|2444105 (U88974) ORF26 [Streptococcus thermophilus temperate
gi|3318666 (U19754) BBA31 homolog [Borrelia burgdorferi]
gi|2690260 (AE000790) conserved hypothetical protein [Borrelia b..

Query= sid|114829|lan|dp1ORF008 Phage dp1 ORF|49624-50961|1
(445 letters)

gi|4406210|gb|AAD19901| (AF100420) DnaB replication fork helicas.
gi|3121983|sp|O25916|DNAB_HELPY REPLICATIVE DNA HELICASE >gi|231..
gi|4416322|gb|AAD20314| (AF106032) replicative helicase; DnaB [B.
gi|4155895 (AE001551) REPLICATIVE DNA HELICASE [Helicobacter pyl..
gi|3322317 (AE001191) replicative DNA helicase (dnaB) [Treponema..
gi|138031|sp|P04530|VG41_BPT4 PRIMASE-HELICASE (PROTEIN GP41) >g..
gi|2983861 (AE000742) replicative DNA helicase [Aquifex aeolicus]

Query= sid|114831|lan|dp1ORF010 Phage dp1 ORF|8699-9859|2
(386 letters)

gi|2760912 (AF037258) RecA protein [Chlorobium tepidum]
gi|3219851|sp|P94666|RECA_CLOPE RECA PROTEIN >gi|1698591 (U61497...
gi|1350566|sp|P48295|RECA_STRVL RECA PROTEIN >gi|508860 (U04837) ...
gi|744163|prf||2014250A recA-like protein [Streptomyces violaceus]
gi|730487|sp|P41054|RECA_STRAM RECA PROTEIN >gi|511133|emb|CAA82...
gi|2687334|emb|CAA15875| (AL020958) RecA protein [Streptomyces c.
gi|1350565|sp|P48294|RECA_STRLI RECA PROTEIN >gi|481482|pir||S38...
gi|464599|sp|P33542|RECA_AQUPY RECA PROTEIN >gi|1086167|pir||A55..
gi|417636|sp|P32725|RECA_RHOSH RECA PROTEIN >gi|541307|pir||S415..
gi|2984348 (AE000775) recombination protein RecA [Aquifex aeolicus
gi|3219854|sp|P95846|RECA_STRRM RECA PROTEIN >gi|1729800|emb|CAA..
gi|2500086|sp|Q59560|RECA_MYCSM RECA PROTEIN >gi|1430892|emb|CAA..
gi|1350567|sp|P48296|RECA_THEAQ RECA PROTEIN >gi|1072963|pir||A5..
gi|625663|pir||JX0292 recA protein - Thermus aquaticus (strain HB8
gi|1172880|sp|P42440|RECA_CAMJE RECA PROTEIN >gi|2119991|pir||I4..
gi|4154654 (AE001453) RECA PROTEIN. [Helicobacter pylori J99]
gi|1072968|pir||C55020 recA protein - Thermus sp >gi|458472|dbj|..
gi|3219652|sp|P95469|RECA_PARDE RECA PROTEIN >gi|1825468 (U59631..
gi|2507284|sp|P42445|RECA_HELPY RECA PROTEIN >gi|2313235|gb|AAD0..
gi|1172990|sp|Q02350|RECA_STAAU RECA PROTEIN >gi|463285 (L25893)..
gi|4416209|gb|AAD20261| (AF094756) RecA protein [Bifidobacterium.,
gi|2500084|sp|Q59180|RECA_BORBU RECA PROTEIN >gi|1276443 (U23457..

Query= sid|114832|lan|dp1ORF011 Phage dp1 ORF|28017-29096|3
(359 letters)

gi|2444110 (U88974) ORF31 [Streptococcus thermophilus temperate ..
gi|3320436 (AF057033) gp348 [Streptococcus thermophilus bacterio..
gi|479514|pir||S34244 hypothetical protein p38 - actinophage vwb..

Query= sid|114834|lan|dp1ORF013 Phage dp1 ORF|10215-11240|3
(341 letters)

gi|580855|emb|CAA29958| (X06803) dnaZX-like ORF put. DNA polymer.
gi|118807|sp|P09122|DP3X_BACSU DNA POLYMERASE III SUBUNITS GAMMA.
gi|98292|pir||S13786 DNA-directed DNA polymerase (EC 2.7.7.7) II.
gi|1527142 (U66040) DNA polymerase III gamma subunit [Salmonella.
gi|2494197|sp|P74876|DP3X_SALTY DNA POLYMERASE III SUBUNITS GAMM.
gi|118808|sp|P06710|DP3X_ECOLI DNA POLYMERASE III SUBUNITS GAMMA.
gi|4155207 (AE001497) DNA POLYMERASE III SUBUNITS GAMMA AND TAU .
gi|2313541|gb|AAD07767.1| (AE000584) DNA polymerase III gamma an.
gi|2583049 (AF025391) DNA polymerase III holoenzyme tau subunit .
gi|2984127 (AE000759) DNA polymerase III gamma subunit [Aquifex .
gi|3861390|emb|CAA15289| (AJ235273) DNA POLYMERASE III SUBUNITS .
gi|1169397|sp|P43746|DP3X_HAEIN DNA POLYMERASE III SUBUNITS GAMM.
gi|1293572 (U49738) DNA polymerase III tau homolog DnaX [Cauloba.
gi|3328753 (AE001306) DNA Pol III Gamma and Tau [Chlamydia trach.
gi|4376294|gb|AAD18193| (AE001589) DNA Polymerase III Gamma and .
gi|581255|emb|CAA28175| (X04487) alternate dnaZX protein (AA 1-6.
gi|2688379 (AE001151) DNA polymerase III, subunits gamma and tau.
gi|3323329 (AE001268) DNA polymerase III, subunits gamma and tau.



Query= sid|114835|lan|dp1ORF014 Phage dp1 ORF|50961-51974|3
(337 letters)

gi|1346796|sp|P47492|PRIM_MYCGE DNA PRIMASE >gi|1361496|pir||F64..
gi|740008|prf||2004290A primase [Haemophilus influenzae]
gi|1172619|sp|Q08346|PRIM_HAEIN DNA PRIMASE >gi|1074033|pir||A64..
gi|1709769|sp|Q04505|PRIM_LACLA DNA PRIMASE >gi|1075726|pir||JC2..
gi|639846|dbj|BAA03516| (D14690) DNA primase [Lactococcus lactis]
Query- sid| 114B37 | lan|dplORF016 Phage dpi 0RF| 43413-44303 ) 3 
(296 letters) 

gi|1934766|emb|CAB079B6| (Z93946) N-acetylmuramoyl-L-alanine ami., 
gij 113676 |sp|P066S3|ALYS_STRPN AUTOLYS IN (N-ACETYLMURAMOYL-L-ALA. . 
gi j 282326 jpir| |A42935 N-acetylmuramoyl-L- alanine amidase (EC 3.5.. 
gi|416618|8p|P32762|ALYS_BFHB3 LYTIC AMIDASE (N-ACETYLMURAMOYL-L. . 
gij 285273 jpirj | A42936 N-acetylmuramcy 1 - L- alanine amidase - (EC 3.5.. 
gi j 127787 j sp | P15057 1 LYCA_BPCP1 LYS0ZYM8 (ENDOLYSIN) (MURAMIDASE) . . 
gi J 67761 j pir j jMUBPCP N-acetylraurarooyl-L-alanine amidase (EC 3.5... 
gi ( 127789 1 Spj P19386 | LYCA_BPCP9 LYSOZYMB (ENDOLYSIN) (MURAMIDASE).. 
gi|928832 (L44593) ORF259; putative [Lactococcus lactis phage BR. . 
gi|251170S|embjCAA7l783| (Y1081B) sigA binding protein [Streptoc. . 
gij 4097980 (U72655) surface protein C [Streptococcus pneumoniae) 
gij 2351768 (U89711) PspA [Streptococcus pneumoniae) 
gij2425109 (AF019904) choline binding protein A (Streptococcus- p. 
gi|28233Sjpirj |A41971 surface protein pspA precursor - Streptoco. 
gi|2 57633l|embjCAA05158| (AJ002054) SpsA protein (Streptococcus . 
gi|2127295jpir| JS57962 cspC protein - Clostridium acetobutylicum. 
gij 2576333 j emb |CAA05159| (AJ002055) SpsA protein (Streptococcus . 
gij4106522jgbjAAD02874.lt (AF097909) excreted protein FibB [Pept. 
gij 1361406 j pir | |SS7714 cspB protein - Clostridium acetobutylicum. 
gij 1914872 j emb j CAB04758J (Z82001) PCPA [Streptococcus pneumoniae] 
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gi 13168594) dbj |BAA28613| (AB012763) SpaA (Erysipelothrix rhusiop. 
gi|22927S0|embjcAA64942 j (X95646) homology to orf259 of laetoeoc. 
gi 1 2935696 (AF032122) putative lysin [Streptococcus thermophilus . 
gij 4586910) dbj |BAA76540.l| (AB017447) protective antigen SpaA.l . 
gij 3540294 (AF0S7033) lyain [Streptococcus thermophilu3 bacterio. 

Query= sid|114841|lan|dp1ORF020 Phage dp1 ORF|1864-2658|1
(264 letters)

gi|2633745|emb|CAB13247| (Z99111) similar to coenzyme PQQ synthe...
gi|2808502|emb|CAA12532| (AJ225561) ExsD protein [Sinorhizobium
gi|3861151|emb|CAA15051| (AJ235272) unknown [Rickettsia prowazekii]
gi|1652793|dbj|BAA17712| (D90908) hypothetical protein [Synechoc...
gi|1723815|sp|P55139|YGCF_ECOLI HYPOTHETICAL 25.0 KD PROTEIN IN ...
gi|2984272 (AE000769) hypothetical protein [Aquifex aeolicus]
gi|4155435 (AE001516) putative [Helicobacter pylori J99]
gi|2127833|pir||C64505 coenzyme PQQ synthesis protein III homolo.
gi|2622338 (AE000890) coenzyme PQQ synthesis protein III [Methan.
gi|3257042|dbj|BAA29725| (AP000003) 254aa long hypothetical prot.
gi|2314068|gb|AAD07976.1| (AE000602) conserved hypothetical prot.
gi|1723816|sp|P45097|YGCF_HAEIN HYPOTHETICAL PROTEIN HI1189 >gi|.

Query= sid|114842|lan|dp1ORF021 Phage dp1 ORF|2504-3295|2
(263 letters)

gi|127481|sp|P19465|GCH1_BACSU GTP CYCLOHYDROLASE I (GTP-CH-I) >.
gi|3242315|emb|CAA04237| (AJ000685) GTP cyclohydrolase I [Streptoc.
gi|2494695|sp|Q54769|GCH1_SYNP7 GTP CYCLOHYDROLASE I (GTP-CH-I) .
gi|255061|bbs|112832 (S44049) GTP cyclohydrolase I {clone hGCH-1.
gi|4503949|ref|NP_000152.1|PGCH1| GTP cyclohydrolase 1 (dopa-rea.
gi|2113967|emb|CAB08935| (Z95557) folE [Mycobacterium tuberculosis]
gi|1730240|sp|P50141|GCH1_CHICK GTP CYCLOHYDROLASE I (GTP-CH-I) ..
gi|2494696|sp|Q55759|GCH1_SYNY3 GTP CYCLOHYDROLASE I (GTP-CH-I) ..
gi|121061|sp|P22288|GCH1_RAT GTP CYCLOHYDROLASE I PRECURSOR (GTP..
gi|3183014|sp|O13774|GCH1_SCHPO GTP CYCLOHYDROLASE I (GTP-CH-I) ..
gi|3097224|emb|CAA18795| (AL023093) GTP cyclohydrolase I [Mycoba..
gi|2494697|sp|Q19980|GCH1_CAEEL PROBABLE GTP CYCLOHYDROLASE I (G..
gi|462167|sp|Q05915|GCH1_MOUSE GTP CYCLOHYDROLASE I PRECURSOR (G..
gi|1669664|emb|CAA89808| (Z49706) GTP cyclohydrolase 1 [Dictyost..
gi|2981082 (AF052048) GTP-cyclohydrolase [Ostertagia ostertagi]
gi|31954|emb|CAA78908| (Z16418) GTP cyclohydrolase I [Homo sapi..
gi|551344|bbs|150280 (S71373) GTP cyclohydrolase I [mice, Peptid..
gi|1730247|sp|P51601|GCH1_YEAST GTP CYCLOHYDROLASE I (GTP-CH-I)
gi|1246912|emb|CAA8739?| (Z47201) GTP cyclohydrolase 1 [Saccharo..
gi|1730246|sp|P51595|GCH1_STRPN GTP CYCLOHYDROLASE I (GTP-CH-I) ..
gi|2982951 (AE000680) GTP cyclohydrolase I [Aquifex aeolicus]

Query= sid|114843|lan|dp1ORF022 Phage dp1 ORF|30896-31675|2
(259 letters)

gi|2347102 (U77367) internalin [Listeria monocytogenes]
gi|3123226|sp|P25146|INLA_LISMO INTERNALIN A PRECURSOR >gi|48705...
gi|149674 (M67471) internalin [Listeria monocytogenes]

Query= sid|114850|lan|dp1ORF029 Phage dp1 ORF|662-1348|2
(228 letters)

gi|2650185 (AE001074) succinoglycan biosynthesis regulator (exsB...
gi|3861231|emb|CAA15131| (AJ235272) unknown [Rickettsia prowazekii]
gi|2622210 (AE000881) conserved protein [Methanobacterium thermo...
gi|2983380 (AE000709) trans-regulatory protein ExsB [Aquifex aeo...
gi|1001327|dbj|BAA10814| (D64006) ExsB [Synechocystis sp.]
gi|2128055|pir||B64468 hypothetical protein homolog MJ1347 - Met...
gi|4155143 (AE001491) putative [Helicobacter pylori J99]
gi|2313760|gb|AAD07701.1| (AE000578) conserved hypothetical prot...
gi|2120814|pir||S60183 protein ExsB - Rhizobium meliloti >gi|114...
gi|2633743|emb|CAB13245| (Z99111) similar to hypothetical protei...
gi|1175543|sp|P44124|YBAX_HAEIN HYPOTHETICAL PROTEIN HI1191 >gi|...
gi|2495537|sp|P77756|YBAX_ECOLI HYPOTHETICAL 25.5 KD PROTEIN IN ...
gi|3256471|dbj|BAA29154.1| (AP000001) 269aa long hypothetical pr...
gi|2921156 (AF022216) aluminum resistance protein [Arthrobacter ...

Query= sid|114855|lan|dp1ORF034 Phage dp1 ORF|131-652|2
(173 letters)

gi|4155926 (AE001554) putative [Helicobacter pylori J99]
gi|2314588|gb|AAD08456.1| (AE000642) conserved hypothetical prot...
gi|2983459 (AE000714) hypothetical protein [Aquifex aeolicus]
gi|1006604|dbj|BAA10757| (D64005) hypothetical protein [Synechoc...
gi|2967529 (U11045) unknown [Buchnera aphidicola]
gi|2495654|sp|Q46920|YQCD_ECOLI HYPOTHETICAL 32.6 KD PROTEIN IN ...
gi|1175604|sp|P44153|YQCD_HAEIN HYPOTHETICAL PROTEIN HI1291 >gi|...
gi|3860642|emb|CAA14543| (AJ235270) unknown [Rickettsia prowazekii]

Query= sid|114857|lan|dp1ORF036 Phage dp1 ORF|48808-49362|1
(184 letters)

gi|1353529 (U38906) ORF12 [Bacteriophage r1t] 53 1e-06

Query= sid|114859|lan|dp1ORF038 Phage dp1 ORF|1350-1871|3
(173 letters)

gi|1175542|sp|P44123|YB90_HAEIN HYPOTHETICAL PROTEIN HI1190 >gi|... 100 6e-21
gi|2982977 (AE000681) hypothetical protein [Aquifex aeolicus] 67 7e-11
gi|3860744|emb|CAA14645| (AJ235270) unknown [Rickettsia prowazekii] 65 3e-10
gi|2650193 (AE001074) conserved hypothetical protein [Archaeoglo... 58 4e-08
gi|3258383|dbj|BAA31065.1| (AP000007) 157aa long hypothetical pr... 55 2e-07
gi|1001713|dbj|BAA10550| (D64004) hypothetical protein [Synechoc... 50 8e-06
gi|4155434 (AE001516) putative [Helicobacter pylori J99] 50 1e-05

Query= sid|114860|lan|dp1ORF039 Phage dp1 ORF|3306-3803|3
(165 letters)

gi|1922884|emb|CAA68244| (X99978) ORF7; hydophobic protein [Lact... 64 5e-10

Query= sid|114862|lan|dp1ORF041 Phage dp1 ORF|8208-8699|3
(163 letters)

gi|2522313 (AF012906) dUTPase homolog [Bacillus subtilis] >gi|26... 108 2e-23
gi|2634150|emb|CAB13650| (Z99113) similar to deoxyuridine 5'-tri... 108 3e-23
gi|3913546|sp|O54134|DUT_STRCO DEOXYURIDINE 5'-TRIPHOSPHATE NUCL... 56 2e-07
gi|3913542|sp|O48500|DUT_BPT5 DEOXYURIDINE 5'-TRIPHOSPHATE NUCLE... 52 3e-06
gi|3913S4a|sp|O68992|DUT_CHLTE DEOXYURIDINE 5'-TRIPHOSPHATE NUCL... 50 1e-05

Query= sid|114867|lan|dp1ORF046 Phage dp1 ORF|42774-43202|3
(142 letters)

gi|1934764|emb|CAB07984| (Z93946) hypothetical protein [bacterio... 287 2e-77

Query= sid|114901|lan|dp1ORF080 Phage dp1 ORF|42490-42759|1
(89 letters)

gi|1934763|emb|CAB07983| (Z93946) hypothetical protein [bacterio... 147 1e-35

Query= sid|114912|lan|dp1ORF091 Phage dp1 ORF|43189-43413|1
(74 letters)

gi|1934765|emb|CAB07985| (Z93946) holin [bacteriophage Dp-1] 63 2e-10

Table 32



Sequence of Dp1 published by Sheehan et al., 4731 nucleotides.

   1  tttaaatctt ttgacaaagc caattcaaat tgtaccgctg aagcaatttt ccatgtatcc actcaaagtt
  71  gttcagtgtg gctcaatcat attaaaatcg aacttggtaa tatctctact ccttttagtg aagcagagga
 141  agaccttaaa tatcgaattg acccaaaagc cgatcaaaag ctaactaacc aacagttgac ggcactcacg
 211  gaaaaggctc aactacatga cgcagaaccg aaagccaagg ctacaatgga gcagttaagt aacttagaaa
 281  aggcttatga aggtagaatg aaagctaatg aagaagctat caacaaatcg gaacccgacc taatcttagc
 351  ggcaagtcga attgaagcta ctatccaaga acttggcggg ccacgggaac tgaagaagtt cgtcgacagt
 421  tgcatgagct cttctaatca aggtctaact atcggtaaga acgacggtag ctctaccact aaggtatcaa
 491  gtgaccgaat ttctatgttc tccgcaggga atgaagttat gtaccttacg caagggttca ttcacatcga
 561  caacgggatc ttcacccaat ccattcaagc cggccgacttt agaacggaac aacactcgtt taacccagac
 631  acgaacgtga ttcggcatgt aggataagga gaataacatg acaaaattta tcaactcaca cggccctctt
 701  cacttgaacc tttacgtcga acaagtcagt caggacgtaa cgaacaactc cccgcgagtt agttggcgag
 771  ctactgtcga ccgcgacgga gcttatcgaa cgtggactta tggaaatatt agtaaccttt ccgcatggtt
 841  aaatggttca agcgctcata gcagtcaccc agactacgac acgcccggcg aagaggtaac gcccgcaagt
 911  ggagaagcga ctgttcctca caacagtgac gggacaaaga caatgtccgc ttgggctccg tttgacccta
 981  acaacggcgc tcacggaaat accactatct ctaccaatta cactttagac agtattccaa ggtctacaca
1051  gattcccagt tttgagggaa atcgaaacct aggatcttta catacggtta tcttcaaccg aaaagcgaac
1121  tcttttacgc atcaagttcg gtaccgagtt ttcggtagcg actggataga tttaggtaag aaccatacta
1191  ctagcgcacc ctttacgccg tcactggact cagcaaggta cttacctaaa tcaagctccg gaacaatgga
1261  catctgtatt cgaacctata acggaactac gcaaattggt agtgacgtct attcaaacgg acggaggttc
1331  aacatccccg attcagtacg tcctactttt tcgggcattt ctttagtaga cacgacttca gcggttcgac
1401  agattttaac agggaacaac ttcctccaaa tcatgtcgaa cattcaagtc aacttcaaca acgcctccgg
1471  cgcttacgga tccactatcc aagcactcca cgctgagctc gtaggtaaaa accaagctac caacgaaaac
1541  ggcggcaaat tgggtatgat gaactttaat ggctccgcca ccgtaagagc atgggttaca gacacgcgag
1611  gaaaacaatc gaacgtccaa gacgtaccta tcaatgttat agaatactat ggaccgtcta tcaatttctc
1681  cgctcaacgt actcgtcaaa atcccgcaat tacccaagcc cttcgaaatg ctaaggtcgc acctataacg
1751  gtaggaggtc aacagaaaaa catcatgcaa attaccttct ccgtggcgcc gttgaacact accaatttca
1821  cagaagatag aggttcggcg tcagggacgc tcactactat ttccctactg accaactcgt ccgcgaactt
1891  agctggtaac tacgggccgg acaagtctta catagttaag gctaaaatcc aagacaggtt cacttcgact
1961  gaatttagtg ctacggtacc taccgaatca gtagttctta accatgacaa ggacggtcga cttggagttg
2031  gtaaggttgc agaacaaggg aaggcagggc caatcgatgc agcaggcgat atatacgctg gaggtcgaca
2101  agttcaacag tttcagctca ctgataataa tggagcattg aacaggggtc aatataacga tgttggaata
2171  agcgtgaaac agagtttaca tggcgaagta acaaatacga ggacaaccct acgggaactc gaggtgaatg
2241  gggactattt caaaatttct ggttagatag ctggaaaatg gttcaaccct tcattacaat gtcaggaaga
2311  atgttcatca ggacagcgaa cgatggaaac agctggagac ctaacaagtg gaaagaggtc ctatttaagc
2381  aagacttcga acagaataat cggcagaaac ttgttcttca aagtgggtgg aaccatcact caacctatgg
2451  cgacgcattc cattcgaaaa ctcttgacgg catagtatat ttgagaggaa atgtgcataa aggacttatc
2521  gacaaagagg ctactattgc agtacttcct gaaggactta gaccgaaagt ttcaatgtat cttcaggctc
2591  tcaataaccc atatggaaat gccattctat gtatacacac cgacggaaga cttgtggtga aatcgaatgt
2661  agataattcc tggtcaaatt tagacaatgt ctcatttcgt atttaatttg agctgaaacc atgttataat
2731  attttttaga aaggaggtga gaactatgtt gaacctcaca aaatcgcgcc aaattgtggc agagttcact
2801  attggacaag gagctgaaaa gaaacttgtc aaaacaacga ctgtgaacat tgatgcaaac gcagtatcaa
2871  ccgtctctga aactcttcat gacccagact tgtacgctgc gaaccgtcga gaactccgag ctgacgagca
2941  aaaacctcgc gaaactcgtt acgcaatcga agatgaaact aatagctgga gcgggggaaa aaagggggag
3011  cccggctcta acaggctgaa taaggaggcg ccaatctatg ccaatgtggc taaacgacac cgcagtcttg
3081  acgacgatta ttacagcgtg cagcggagtg cctactgtcc tactaaataa gttattcgaa cggaaaccga
3151  acaaagccaa gagcgtttta gaggatatcc ctacaactct tagcactctt aaacagcagg ccgacgggat
3221  tgaccaaacg acagtagcaa ccaatcacca aaatgacgtc attcaagacg gaactagaaa aattcaacgt
3291  taccgtcttt atcacgactt aaaaagggaa gtgataacag gctatacaac tctcgaccat tttagagagc
3361  tctctatttt attcgaaagt tacaagaacc ttggcggaaa cggtgaagtt gaagcctcgt atgaaaaata
3431  caagaaatca ccaattaggg aggaagattt agatgaaact atctaacgaa caacatgacg cagcaaagaa
3501  cgtggtaacc gtagtcgttc cagcagcgat tgcactaatt acaggtcttg gagcgttgta tcaatttgac
3571  actactgcta tcacaggaac cattgcacct ctcgcaactt ttgcaggtac tgttctagga gtttctagcc
3641  gaaactacca aaaggaacaa gaagctcaaa acaatgaggt ggaataacgg gagtcgatat tgaaaaaggc
3711  gttgcgtgga tgcaggcccg aaagggtcga gtaccttata gcatggactt tcgagacggt cctgatagct
3781  atgactgctc aagttctatg tactacgctc tccgctcagc cggagcttca agtgctggat gggcagtcaa
3851  tactgagtac acgcacgcat ggcttattga aaacggttat gaactaatta gtgaaaatgc cccgtgggat
3921  gctaaacgag gcgacatctt catctgggga cgcaaaggtg ctagcgcagg cgctggaggt catacaggga
3991  cgttcattga cagcgataac atcattcact gcaactacgc ctacgacgga atttccgtca acgaccacga
4061  tgagcgttgg tactatgcag gtcaacctta ctactacgcc tatcgcttga ctaacgcaaa tgctcaaccg
4131  gctgagaaga aacttggctg gcagaaagat gctactggtt tctggtacgc tcgagcaaac ggaacttatc
4201  caaaagatga gttcgagtac atcgaagaaa acaagtcttg gccctacttt gacgaccaag gctacatgct
4271  cgctgagaaa tggttgaaac atactgatgg aaattggtat tggttcgacc gtgacggata catggctacg
4341  tcatggaaac ggattggcga gtcatggtac tacttcaatc gcgatggttc aatggtaacc ggttggatta
4411  agtattacga taattggtac cattgcgatg ctaccaacgg cgacacgaaa ccgaatgcgt ttatccgtta
4481  taacgacggc tggtatccac tattaccgga cggacgtctg gcagacaaac ctcaattcac cgtagagccg
4551  gacgggctca ttactgctaa agctcaaaat atagagagga ggaagctctt ttcttaatat tgtttctctt
4621  aatcccgcaa ggtttcgacc ctgcggggtt tatgcgtcgc gaattactct atttacttat tcgaagattt
4691  caattataat taaataatca acgagattca caatcggagg aatg

Table 33



Streptococcus accession numbers

gi|5776553|gb|AF026471.2|AF026471 [5776553]
gi|5410470|gb|AF139890.1|AF139890 [5410470]
gi|5410468|gb|AF139889.1|AF139889 [5410468]
gi|5410466|gb|AF139888.1|AF139888 [5410466]
gi|5410464|gb|AF139887.1|AF139887 [5410464]
gi|5410462|gb|AF139886.1|AF139886 [5410462]
gi|5410460|gb|AF139885.1|AF139885 [5410460]
gi|5410458|gb|AF139884.1|AF139884 [5410458]
gi|5410456|gb|AF139883.1|AF139883 [5410456]
gi|3093394|emb|AJ005697.1|SPN5697 [3093394]
gi|5759208|gb|AF171873.1|AF171873 [5759208]
gi|5758311|gb|AF162664.1|AF162664 [5758311]
gi|5739313|gb|AF161701.1|AF161701 [5739313]
gi|5739310|gb|AF161700.1|AF161700 [5739310]
gi|5726354|gb|AF159448.1|AF159448 [5726354]
gi|5726290|gb|AF127143.1|AF127143 [5726290]
gi|5712666|gb|AF140784.1|AF140784 [5712666]
gi|4218525|emb|AJ009639.1|SPAJ9639 [4218525]
gi|5616524|gb|AF169483.1|AF169483 [5616524]
gi|5579395|gb|AF162656.1|AF162656 [5579395]
gi|5579393|gb|AF162655.1|AF162655 [5579393]
gi|5578890|emb|AJ131985.1|SPN131985 [5578890]
gi|5566442|gb|AF167442.1|AF167442 [5566442]
gi|5459332|emb|AJ243540.1|EVE243540 [5459332]
gi|5305398|gb|AF072811.1|AF072811 [5305398]
gi|5295921|emb|AJ242698.1|SPN242698 [5295921]
gi|5295920|emb|AJ242697.1|SPN242697 [5295920]
gi|5295919|emb|AJ242696.1|SPN242696 [5295919]
gi|5295918|emb|AJ242695.1|SPN242695 [5295918]
gi|4583522|gb|AF140356.1|AF140356 [4583522]
gi|5231206|gb|AF157826.1|AF157826 [5231206]
gi|5231203|gb|AF157825.1|AF157825 [5231203]
gi|5231200|gb|AF157824.1|AF157824 [5231200]
gi|5231197|gb|AF157823.1|AF157823 [5231197]
gi|5231194|gb|AF157822.1|AF157822 [5231194]
gi|5231191|gb|AF157821.1|AF157821 [5231191]
gi|5231188|gb|AF157820.1|AF157820 [5231188]
gi|5231185|gb|AF157819.1|AF157819 [5231185]
gi|5231182|gb|AF157818.1|AF157818 [5231182]
gi|5231179|gb|AF157817.1|AF157817 [5231179]
gi|4336851|gb|AF106138.1|AF106138 [4336851]
gi|4336848|gb|AF106137.1|AF106137 [4336848]
gi|4336845|gb|AF106136.1|AF106136 [4336845]
gi|4336842|gb|AF106135.1|AF106135 [4336842]
gi|4336839|gb|AF106134.1|AF106134 [4336839]
gi|4336836|gb|AF106133.1|AF106133 [4336836]
gi|4336833|gb|AF106132.1|AF106132 [4336833]
gi|3907597|gb|AF094575.1|AF094575 [3907597]
gi|5030425|gb|AF061748.2|AF061748 [5030425]
gi|4902881|emb|AJ239004.1|SPN239004 [4902881]
gi|5001710|gb|AF112358.1|AF112358 [5001710]
gi|5001690|gb|AF106539.1|AF106539 [5001690]
gi|4973271|gb|AF144420.1|AF144420 [4973271]
gi|4973269|gb|AF144419.1|AF144419 [4973269]
gi|4973267|gb|AF144418.1|AF144418 [4973267]
gi|4928190|gb|AF129757.1|AF129757 [4928190]
gi|4927743|gb|AF126061.1|AF126061 [4927743]
gi|4927742|gb|AF126060.1|AF126060 [4927742]
gi|4927741|gb|AF126059.1|AF126059 [4927741]
gi|4495247|emb|AJ240675.1|SPN240675 [4495247]
gi|4495245|emb|AJ240670.1|SPN240670 [4495245]
gi|4495243|emb|AJ240669.1|SPN240669 [4495243]
gi|4495241|emb|AJ240668.1|SPN240668 [4495241]
gi|4495239|emb|AJ240667.1|SPN240667 [4495239]
gi|4495237|emb|AJ240666.1|SPN240666 [4495237]

gi|4495235|emb|AJ240665.1|SPN240665 [4495235]
gi|4495233|emb|AJ240664.1|SPN240664 [4495233]
gi|4495231|emb|AJ240663.1|SPN240663 [4495231]
gi|4495229|emb|AJ240662.1|SPN240662 [4495229]
gi|4495227|emb|AJ240661.1|SPN240661 [4495227]
gi|4495225|emb|AJ240660.1|SPN240660 [4495225]
gi|4495223|emb|AJ240659.1|SPN240659 [4495223]
gi|4495221|emb|AJ240658.1|SPN240658 [4495221]
gi|4495219|emb|AJ240657.1|SPN240657 [4495219]
gi|4495217|emb|AJ240656.1|SPN240656 [4495217]
gi|4495215|emb|AJ240655.1|SPN240655 [4495215]
gi|4495213|emb|AJ240654.1|SPN240654 [4495213]
gi|4495211|emb|AJ240653.1|SPN240653 [4495211]
gi|4495209|emb|AJ240652.1|SPN240652 [4495209]
gi|4495207|emb|AJ240651.1|SPN240651 [4495207]
gi|4495205|emb|AJ240650.1|SPN240650 [4495205]
gi|4495203|emb|AJ240649.1|SPN240649 [4495203]
gi|4495201|emb|AJ240648.1|SPN240648 [4495201]
gi|4495199|emb|AJ240647.1|SPN240647 [4495199]
gi|4495197|emb|AJ240644.1|SPN240644 [4495197]
gi|4495195|emb|AJ240643.1|SPN240643 [4495195]
gi|4495193|emb|AJ240642.1|SPN240642 [4495193]
gi|4495191|emb|AJ240641.1|SPN240641 [4495191]
gi|4495189|emb|AJ240640.1|SPN240640 [4495189]
gi|4495187|emb|AJ240639.1|SPN240639 [4495187]
gi|4495185|emb|AJ240638.1|SPN240638 [4495185]
gi|4495183|emb|AJ240637.1|SPN240637 [4495183]
gi|4495181|emb|AJ240636.1|SPN240636 [4495181]
gi|4495179|emb|AJ240635.1|SPN240635 [4495179]
gi|4495177|emb|AJ240634.1|SPN240634 [4495177]
gi|4495175|emb|AJ240633.1|SPN240633 [4495175]
gi|4495173|emb|AJ240630.1|SPN240630 [4495173]
gi|4495171|emb|AJ240629.1|SPN240629 [4495171]
gi|4495169|emb|AJ240628.1|SPN240628 [4495169]
gi|4495167|emb|AJ240627.1|SPN240627 [4495167]
gi|4495165|emb|AJ240626.1|SPN240626 [4495165]
gi|4495163|emb|AJ240625.1|SPN240625 [4495163]
gi|4495161|emb|AJ240624.1|SPN240624 [4495161]
gi|4495159|emb|AJ240623.1|SPN240623 [4495159]
gi|4495157|emb|AJ240622.1|SPN240622 [4495157]
gi|4495155|emb|AJ240621.1|SPN240621 [4495155]
gi|4495153|emb|AJ240620.1|SPN240620 [4495153]
gi|4495151|emb|AJ240619.1|SPN240619 [4495151]
gi|4495149|emb|AJ240616.1|SPN240616 [4495149]
gi|4495147|emb|AJ240615.1|SPN240615 [4495147]
gi|4495145|emb|AJ240614.1|SPN240614 [4495145]
gi|4495143|emb|AJ240613.1|SPN240613 [4495143]
gi|4495141|emb|AJ240612.1|SPN240612 [4495141]

gi|4495139|emb|AJ240611.1|SPN240611 [4495139]
gi|4495137|emb|AJ240610.1|SPN240610 [4495137]
gi|4495135|emb|AJ240609.1|SPN240609 [4495135]
gi|4495133|emb|AJ240608.1|SPN240608 [4495133]
gi|4495131|emb|AJ240607.1|SPN240607 [4495131]
gi|4495129|emb|AJ240606.1|SPN240606 [4495129]
gi|4883698|gb|AF079807.1|AF079807 [4883698]
gi|4838562|gb|AF145055.1|AF145055 [4838562]
gi|4063727|gb|L29324.1|STRINTE [4063727]
gi|3093401|emb|AJ005619.1|SPAJ5619 [3093401]
gi|4103889|gb|AF029368.1|AF029368 [4103889]
gi|2897689|dbj|D63805.1|D63805 [2897689]
gi|4566771|gb|AF117741.1|AF117741 [4566771]
gi|4566768|gb|AF117740.1|AF117740 [4566768]
gi|4538836|emb|AJ240793.1|SPN240793 [4538836]
gi|4538832|emb|AJ240792.1|SPN240792 [4538832]
gi|4538828|emb|AJ240791.1|SPN240791 [4538828]
gi|4538824|emb|AJ240790.1|SPN240790 [4538824]
gi|4538821|emb|AJ240789.1|SPN240789 [4538821]
gi|4538818|emb|AJ240788.1|SPN240788 [4538818]
gi|4538815|emb|AJ240787.1|SPN240787 [4538815]
gi|4538812|emb|AJ240786.1|SPN240786 [4538812]
gi|4538809|emb|AJ240785.1|SPN240785 [4538809]
gi|4538806|emb|AJ240784.1|SPN240784 [4538806]
gi|4538803|emb|AJ240783.1|SPN240783 [4538803]
gi|4538800|emb|AJ240782.1|SPN240782 [4538800]
gi|4538797|emb|AJ240781.1|SPN240781 [4538797]
gi|4538794|emb|AJ240780.1|SPN240780 [4538794]
gi|4538791|emb|AJ240779.1|SPN240779 [4538791]
gi|4538788|emb|AJ240778.1|SPN240778 [4538788]
gi|4538785|emb|AJ240777.1|SPN240777 [4538785]
gi|4538782|emb|AJ240776.1|SPN240776 [4538782]
gi|4538779|emb|AJ240775.1|SPN240775 [4538779]
gi|4538776|emb|AJ240774.1|SPN240774 [4538776]
gi|4538773|emb|AJ240773.1|SPN240773 [4538773]
gi|4538770|emb|AJ240772.1|SPN240772 [4538770]
gi|4538767|emb|AJ240771.1|SPN240771 [4538767]
gi|4538764|emb|AJ240770.1|SPN240770 [4538764]
gi|4538761|emb|AJ240769.1|SPN240769 [4538761]
gi|4538758|emb|AJ240768.1|SPN240768 [4538758]
gi|4538755|emb|AJ240767.1|SPN240767 [4538755]
gi|4538752|emb|AJ240766.1|SPN240766 [4538752]
gi|4538749|emb|AJ240765.1|SPN240765 [4538749]
gi|4538746|emb|AJ240761.1|SPN240761 [4538746]
gi|4538743|emb|AJ240760.1|SPN240760 [4538743]
gi|4538740|emb|AJ240759.1|SPN240759 [4538740]
gi|4538737|emb|AJ240758.1|SPN240758 [4538737]
gi|4538734|emb|AJ240757.1|SPN240757 [4538734]
gi|4538731|emb|AJ240756.1|SPN240756 [4538731]
gi|4538728|emb|AJ240755.1|SPN240755 [4538728]
gi|4538725|emb|AJ240754.1|SPN240754 [4538725]

gi|4538722|emb|AJ240753.1|SPN240753 [4538722]
gi|4538719|emb|AJ240752.1|SPN240752 [4538719]
gi|4538716|emb|AJ240751.1|SPN240751 [4538716]
gi|4538713|emb|AJ240750.1|SPN240750 [4538713]
gi|4538710|emb|AJ240749.1|SPN240749 [4538710]
gi|4538707|emb|AJ240748.1|SPN240748 [4538707]
gi|4538704|emb|AJ240747.1|SPN240747 [4538704]
gi|4538701|emb|AJ240746.1|SPN240746 [4538701]
gi|4538698|emb|AJ240745.1|SPN240745 [4538698]
gi|4538695|emb|AJ240744.1|SPN240744 [4538695]
gi|4538692|emb|AJ240743.1|SPN240743 [4538692]
gi|4538689|emb|AJ240742.1|SPN240742 [4538689]
gi|4538686|emb|AJ240741.1|SPN240741 [4538686]
gi|4538683|emb|AJ240740.1|SPN240740 [4538683]
gi|4538680|emb|AJ240739.1|SPN240739 [4538680]
gi|4538677|emb|AJ240738.1|SPN240738 [4538677]
gi|4530444|gb|AF118229.1|AF118229 [4530444]
gi|4519253|dbj|AB015852.1|AB015852 [4519253]
gi|4519251|dbj|AB015851.1|AB015851 [4519251]
gi|4519249|dbj|AB015850.1|AB015850 [4519249]
gi|4519247|dbj|AB015849.1|AB015849 [4519247]
gi|4519245|dbj|AB015848.1|AB015848 [4519245]
gi|4519243|dbj|AB015847.1|AB015847 [4519243]
gi|4519241|dbj|AB015846.1|AB015846 [4519241]
gi|4519239|dbj|AB011210.1|AB011210 [4519239]
gi|4519237|dbj|AB011209.1|AB011209 [4519237]
gi|4519235|dbj|AB011208.1|AB011208 [4519235]
gi|4519233|dbj|AB011207.1|AB011207 [4519233]
gi|4519231|dbj|AB011206.1|AB011206 [4519231]
gi|4519229|dbj|AB011205.1|AB011205 [4519229]
gi|4519227|dbj|AB011204.1|AB011204 [4519227]
gi|4519225|dbj|AB011203.1|AB011203 [4519225]
gi|4519223|dbj|AB011202.1|AB011202 [4519223]
gi|4519221|dbj|AB011201.1|AB011201 [4519221]
gi|4519219|dbj|AB011200.1|AB011200 [4519219]
gi|4519217|dbj|AB011199.1|AB011199 [4519217]
gi|4519215|dbj|AB011198.1|AB011198 [4519215]
gi|4495127|emb|AJ240605.1|SPN240605 [4495127]
gi|4468031|emb|AJ132957.1|SPN132957 [4468031]
gi|4468029|emb|AJ132956.1|SPN132956 [4468029]
gi|4218532|emb|AJ010312.1|SPN010312 [4218532]
gi|4456852|emb|AJ236792.1|SPN236792 [4456852]
gi|4456850|emb|AJ236791.1|SPN236791 [4456850]
gi|4456848|emb|AJ236790.1|SPN236790 [4456848]
gi|4456846|emb|AJ236789.1|SPN236789 [4456846]
gi|3550644|emb|AJ006987.1|SPAJ6987 [3550644]
gi|3550625|emb|AJ006986.1|SPAJ6986 [3550625]
gi|4416518|gb|AF014458.2|AF014458 [4416518]
gi|4406260|gb|AF105116.1|AF105116 [4406260]
gi|4406257|gb|AF105115.1|AF105115 [4406257]
gi|4406254|gb|AF105114.1|AF105114 [4406254]
gi|4406246|gb|AF105113.1|AF105113 [4406246]
gi|4406243|gb|AF105112.1|AF105112 [4406243]
gi|4138533|emb|AJ005815.1|SPN5815 [4138533]
gi|3821726|emb|AJ232433.1|SPN232433 [3821726]
gi|3821724|emb|AJ232432.1|SPN232432 [3821724]
gi|3821722|emb|AJ232431.1|SPN232431 [3821722]
gi|3821720|emb|AJ232430.1|SPN232430 [3821720]
gi|3821718|emb|AJ232429.1|SPN232429 [3821718]

gi|3821716|emb|AJ232428.1|SPN232428 [3821716]
gi|3821714|emb|AJ232427.1|SPN232427 [3821714]
gi|3821712|emb|AJ232426.1|SPN232426 [3821712]
gi|3821710|emb|AJ232425.1|SPN232425 [3821710]
gi|3821708|emb|AJ232424.1|SPN232424 [3821708]
gi|3821706|emb|AJ232423.1|SPN232423 [3821706]
gi|3821704|emb|AJ232422.1|SPN232422 [3821704]
gi|3821702|emb|AJ232421.1|SPN232421 [3821702]
gi|3821700|emb|AJ232420.1|SPN232420 [3821700]
gi|3821698|emb|AJ232419.1|SPN232419 [3821698]
gi|3821696|emb|AJ232418.1|SPN232418 [3821696]
gi|3821694|emb|AJ232417.1|SPN232417 [3821694]
gi|3821692|emb|AJ232416.1|SPN232416 [3821692]
gi|3821690|emb|AJ232415.1|SPN232415 [3821690]
gi|3821688|emb|AJ232414.1|SPN232414 [3821688]
gi|3821686|emb|AJ232413.1|SPN232413 [3821686]
gi|3821684|emb|AJ232412.1|SPN232412 [3821684]
gi|3821682|emb|AJ232411.1|SPN232411 [3821682]
gi|3821680|emb|AJ232410.1|SPN232410 [3821680]
gi|3821678|emb|AJ232409.1|SPN232409 [3821678]
gi|3821676|emb|AJ232408.1|SPN232408 [3821676]
gi|3821674|emb|AJ232407.1|SPN232407 [3821674]
gi|3821672|emb|AJ232406.1|SPN232406 [3821672]
gi|3821670|emb|AJ232405.1|SPN232405 [3821670]
gi|3821668|emb|AJ232404.1|SPN232404 [3821668]
gi|3821666|emb|AJ232403.1|SPN232403 [3821666]
gi|3821664|emb|AJ232402.1|SPN232402 [3821664]
gi|3821662|emb|AJ232401.1|SPN232401 [3821662]
gi|3821660|emb|AJ232399.1|SPN232399 [3821660]
gi|3821658|emb|AJ232398.1|SPN232398 [3821658]
gi|3821656|emb|AJ232397.1|SPN232397 [3821656]
gi|3821654|emb|AJ232396.1|SPN232396 [3821654]
gi|3821652|emb|AJ232395.1|SPN232395 [3821652]
gi|3821650|emb|AJ232394.1|SPN232394 [3821650]
gi|3821648|emb|AJ232393.1|SPN232393 [3821648]
gi|3821646|emb|AJ232392.1|SPN232392 [3821646]
gi|3821644|emb|AJ232391.1|SPN232391 [3821644]
gi|3821642|emb|AJ232390.1|SPN232390 [3821642]
gi|3821640|emb|AJ232389.1|SPN232389 [3821640]
gi|3821638|emb|AJ232388.1|SPN232388 [3821638]
gi|3821636|emb|AJ232387.1|SPN232387 [3821636]
gi|3821634|emb|AJ232386.1|SPN232386 [3821634]
gi|3821632|emb|AJ232385.1|SPN232385 [3821632]
gi|3821630|emb|AJ232384.1|SPN232384 [3821630]
gi|3821628|emb|AJ232383.1|SPN232383 [3821628]
gi|3821626|emb|AJ232382.1|SPN232382 [3821626]
gi|3821624|emb|AJ232381.1|SPN232381 [3821624]
gi|3821622|emb|AJ232380.1|SPN232380 [3821622]

gi|3821620|emb|AJ232379.1|SPN232379 [3821620]
gi|3821618|emb|AJ232378.1|SPN232378 [3821618]
gi|3821616|emb|AJ232377.1|SPN232377 [3821616]
gi|3821614|emb|AJ232376.1|SPN232376 [3821614]
gi|3821612|emb|AJ232375.1|SPN232375 [3821612]
gi|3821610|emb|AJ232373.1|SPN232373 [3821610]
gi|3821608|emb|AJ232372.1|SPN232372 [3821608]
gi|3821606|emb|AJ232371.1|SPN232371 [3821606]
gi|3821604|emb|AJ232370.1|SPN232370 [3821604]
gi|3821602|emb|AJ232369.1|SPN232369 [3821602]
gi|3821600|emb|AJ232368.1|SPN232368 [3821600]
gi|3821598|emb|AJ232367.1|SPN232367 [3821598]
gi|3821596|emb|AJ232366.1|SPN232366 [3821596]
gi|3821594|emb|AJ232365.1|SPN232365 [3821594]
gi|3820454|emb|AJ007367.1|SPN7367 [3820454]
gi|3821592|emb|AJ232364.1|SPN232364 [3821592]
gi|3821590|emb|AJ232363.1|SPN232363 [3821590]
gi|3821588|emb|AJ232362.1|SPN232362 [3821588]
gi|3821586|emb|AJ232361.1|SPN232361 [3821586]
gi|3821584|emb|AJ232360.1|SPN232360 [3821584]
gi|3821582|emb|AJ232359.1|SPN232359 [3821582]
gi|3821580|emb|AJ232358.1|SPN232358 [3821580]
gi|3821578|emb|AJ232357.1|SPN232357 [3821578]
gi|3821576|emb|AJ232356.1|SPN232356 [3821576]
gi|3821574|emb|AJ232355.1|SPN232355 [3821574]
gi|3821572|emb|AJ232353.1|SPN232353 [3821572]
gi|3821570|emb|AJ232352.1|SPN232352 [3821570]
gi|3821568|emb|AJ232351.1|SPN232351 [3821568]
gi|3821566|emb|AJ232350.1|SPN232350 [3821566]
gi|3821564|emb|AJ232349.1|SPN232349 [3821564]
gi|3821562|emb|AJ232348.1|SPN232348 [3821562]
gi|3821560|emb|AJ232347.1|SPN232347 [3821560]
gi|3821558|emb|AJ232346.1|SPN232346 [3821558]
gi|3821556|emb|AJ232345.1|SPN232345 [3821556]
gi|3821554|emb|AJ232344.1|SPN232344 [3821554]
gi|3821552|emb|AJ232343.1|SPN232343 [3821552]
gi|3821550|emb|AJ232342.1|SPN232342 [3821550]
gi|3821548|emb|AJ232341.1|SPN232341 [3821548]
gi|3821546|emb|AJ232340.1|SPN232340 [3821546]
gi|3821544|emb|AJ232339.1|SPN232339 [3821544]
gi|3821542|emb|AJ232338.1|SPN232338 [3821542]
gi|3821540|emb|AJ232337.1|SPN232337 [3821540]
gi|3821538|emb|AJ232336.1|SPN232336 [3821538]
gi|3821536|emb|AJ232335.1|SPN232335 [3821536]
gi|3821534|emb|AJ232334.1|SPN232334 [3821534]
gi|3821532|emb|AJ232333.1|SPN232333 [3821532]
gi|3821530|emb|AJ232332.1|SPN232332 [3821530]
gi|3821528|emb|AJ232331.1|SPN232331 [3821528]

gi|382 1 526|emb! AJ232330. 1 (SPN232330 
[3821526] 

gi|382 1 524|emb| A J232329. 1 ISPN232329 
[3821524] 

gi|3821522|emblAJ232328.1|SPN232328 
[3821522] 

gi|3821520|emb|AJ232327.1|SPN232327 
[3821520] 

gi|382 1 5 1 8|emb| A J232326. 1 |SPN232326 
[3821518] 

gi[3821516|emb|AJ232325.1|SPN232325 
[3821516] 

gi|3821514|cmb|AJ232324.1|SPN232324 
[3821514] 

gi|3821512|cmb|AJ232322.1|SPN232322 
[3821512] 

gi|3821510|emb|AJ23232U|SPN232321 
[3821510] 

gi|3821508|emb|AJ232320.1|SPN232320 
[3821508] 

gi|3821506|emb|AJ232319.1|SPN232319 
[3821506] 

gi|3821504|emb|AJ232318.1|SPN232318 
[3821504] 

gi(3 82 1 502|emb| AJ2323 17.1 |SPN2323 1 7 
[3821502] 

gi|3821 500|cmb|AJ2323 16.1|SPN2323 1 6 
[3821500] 

gi|3821 498|cmb|AJ2323 15. 1 (SPN2323 1 5 
[3821498] 

gi|3821496|emb|AJ2323I4.1|SPN232314 
[3821496] 

gi|3821494|cmb|AJ2323I3.1|SPN232313 
[3821494] 

gii3821492|emb|AJ2323 12.1|SPN232312 
[3821492] 

gi|382 1 490|cmb[AJ2323 11.1 (SPN2323 1 1 
[3821490] 

gi|3821488|emb|AJ232310.!|SPN23231O 
[3821488] 

gi|3821486Iemb|AJ232309.1|SPN232309 
[3821486] 

gi|3821484|emb|AJ232308.1|SPN232308 
[3821484] 

gi|382 1 482|emb| AJ232307. 1 (SPN232307 
[3821482] 



gi|3821480|emb|AJ232306.1|SPN232306
[3821480]

gi|3821478|emb|AJ232305.1|SPN232305
[3821478]

gi|3821476|emb|AJ232304.1|SPN232304
[3821476]

gi|3821474|emb|AJ232303.1|SPN232303
[3821474]

gi|3821472|emb|AJ232302.1|SPN232302
[3821472]

gi|3821470|emb|AJ232301.1|SPN232301
[3821470]

gi|3821468|emb|AJ232300.1|SPN232300
[3821468]

gi|3821466|emb|AJ232299.1|SPN232299
[3821466]

gi|3821464|emb|AJ232298.1|SPN232298
[3821464]

gi|3821462|emb|AJ232297.1|SPN232297
[3821462]

gi|3821460|emb|AJ232295.1|SPN232295
[3821460]

gi|3821458|emb|AJ232294.1|SPN232294
[3821458]

gi|3821456|emb|AJ232293.1|SPN232293
[3821456]

gi|3821454|emb|AJ232292.1|SPN232292
[3821454]

gi|3821452|emb|AJ232291.1|SPN232291
[3821452]

gi|3821450|emb|AJ232290.1|SPN232290
[3821450]

gi|3821448|emb|AJ232289.1|SPN232289
[3821448]

gi|3821446|emb|AJ232288.1|SPN232288
[3821446]

gi|3821444|emb|AJ232287.1|SPN232287
[3821444]

gi|3821442|emb|AJ232286.1|SPN232286
[3821442]

gi|3821440|emb|AJ232285.1|SPN232285
[3821440]

gi|3821438|emb|AJ232284.1|SPN232284
[3821438]

gi|3821436|emb|AJ232283.1|SPN232283
[3821436]

gi|3821434|emb|AJ232282.1|SPN232282
[3821434]

gi|3821432|emb|AJ232281.1|SPN232281
[3821432]

gi|3821430|emb|AJ232280.1|SPN232280
[3821430]

gi|3821428|emb|AJ232279.1|SPN232279
[3821428]

gi|3821426|emb|AJ232278.1|SPN232278
[3821426]

gi|3821424|emb|AJ232276.1|SPN232276
[3821424]

gi|3821422|emb|AJ232275.1|SPN232275
[3821422]

gi|3821420|emb|AJ232274.1|SPN232274
[3821420]

gi|3821418|emb|AJ232273.1|SPN232273
[3821418]

gi|3821416|emb|AJ232272.1|SPN232272
[3821416]

gi|3821414|emb|AJ232271.1|SPN232271
[3821414]

gi|3821412|emb|AJ232270.1|SPN232270
[3821412]

gi|3821410|emb|AJ232269.1|SPN232269
[3821410]

gi|3821408|emb|AJ232268.1|SPN232268
[3821408]

gi|3821406|emb|AJ232267.1|SPN232267
[3821406]

gi|3821404|emb|AJ232266.1|SPN232266
[3821404]

gi|3821402|emb|AJ232265.1|SPN232265
[3821402]

gi|3821400|emb|AJ232264.1|SPN232264
[3821400]

gi|3821398|emb|AJ232263.1|SPN232263
[3821398]

gi|3821396|emb|AJ232262.1|SPN232262
[3821396]

gi|3821394|emb|AJ232261.1|SPN232261
[3821394]

gi|3821392|emb|AJ232260.1|SPN232260
[3821392]

gi|3821390|emb|AJ232259.1|SPN232259
[3821390]

gi|3821388|emb|AJ232258.1|SPN232258
[3821388]

gi|3821386|emb|AJ232257.1|SPN232257
[3821386]



gi|3821384|emb|AJ232256.1|SPN232256
[3821384]

gi|3821382|emb|AJ232255.1|SPN232255
[3821382]

gi|3821380|emb|AJ232254.1|SPN232254
[3821380]

gi|3821378|emb|AJ232253.1|SPN232253
[3821378]

gi|3821376|emb|AJ232252.1|SPN232252
[3821376]

gi|3821374|emb|AJ232251.1|SPN232251
[3821374]

gi|3821372|emb|AJ232250.1|SPN232250
[3821372]

gi|3821370|emb|AJ232249.1|SPN232249
[3821370]

gi|3821367|emb|AJ232248.1|SPN232248
[3821367]

gi|3821365|emb|AJ232247.1|SPN232247
[3821365]

gi|3821363|emb|AJ232246.1|SPN232246
[3821363]

gi|3821361|emb|AJ232245.1|SPN232245
[3821361]

gi|3821359|emb|AJ232244.1|SPN232244
[3821359]

gi|3821357|emb|AJ232243.1|SPN232243
[3821357]

gi|3821355|emb|AJ232241.1|SPN232241
[3821355]

gi|2921842|gb|AF047385.1|AF047385 [2921842]

gi|2909863|gb|AF047696.1|AF047696 [2909863]

gi|4193353|gb|AF055088.1|AF055088 [4193353]

gi|4185242|gb|AH007276.1|SEG_SPTNJUNC
[4185242]

gi|4185241|gb|AF066797.1|SPTNJUNC2
[4185241]

gi|4185240|gb|AF066796.1|SPTNJUNC1
[4185240]

gi|4097979|gb|U72655.1|SPU72655 [4097979]
gi|4063720|gb|U9323.1|STRMTR [4063720]
gi|1657605|gb|U66846.1|SPU66846 [1657605]
gi|1657602|gb|U66845.1|SPU66845 [1657602]
gi|4009485|gb|AF068903.1|AF068903 [4009485]
gi|4009477|gb|AF068902.1|AF068902 [4009477]
gi|4009462|gb|AF068901.1|AF068901 [4009462]

gi|3947767|emb|AJ233896.1|SPN233896
[3947767]

gi|3947765|emb|AJ233895.1|SPN233895
[3947765]

gi|3947763|emb|AJ233894.1|SPN233894
[3947763]

gi|3947761|emb|AJ233893.1|SPN233893
[3947761]

gi|3947759|emb|AJ233892.1|SPN233892
[3947759]

gi|3947757|emb|AJ233891.1|SPN233891
[3947757]

gi|3947755|emb|AJ233890.1|SPN233890
[3947755]

gi|3947753|emb|AJ233889.1|SPN233889
[3947753]

gi|3947751|emb|AJ233888.1|SPN233888
[3947751]

gi|3947749|emb|AJ233887.1|SPN233887
[3947749]

gi|3947730|emb|AJ233886.1|SPN233886
[3947730]

gi|3758891|emb|Z71552.1|SPADCA [3758891]
gi|3818479|gb|AF057294.1|AF057294 [3818479]
gi|2351767|gb|U89711.1|SPU89711 [2351767]
gi|3395661|dbj|AB006879.1|AB006879 [3395661]
gi|3395659|dbj|AB006878.1|AB006878 [3395659]
gi|3395657|dbj|AB006877.1|AB006877 [3395657]
gi|3395655|dbj|AB006876.1|AB006876 [3395655]
gi|3395653|dbj|AB006875.1|AB006875 [3395653]
gi|3395651|dbj|AB006874.1|AB006874 [3395651]
gi|3395649|dbj|AB006873.1|AB006873 [3395649]
gi|3395647|dbj|AB006872.1|AB006872 [3395647]
gi|3395645|dbj|AB006871.1|AB006871 [3395645]
gi|3395643|dbj|AB006870.1|AB006870 [3395643]
gi|3395641|dbj|AB006869.1|AB006869 [3395641]
gi|3395639|dbj|AB006868.1|AB006868 [3395639]
gi|2315992|gb|U87092.1|SPU87092 [2315992]
gi|2209338|gb|U93576.1|SPU93576 [2209338]

gi|2109442|gb|AF000658.1|SPDNAARG
[2109442]

gi|1881538|gb|U09239.1|SPU09239 [1881538]
gi|1666904|gb|U76218.1|SPU76218 [1666904]
gi|1613766|gb|U33315.1|SPU33315 [1613766]
gi|1498294|gb|U41735.1|SPU41735 [1498294]
gi|1213493|gb|U47687.1|SPU47687 [1213493]
gi|1163109|gb|U43526.1|SPU43526 [1163109]
gi|556001|gb|U15171.1|SPU15171 [556001]
gi|455063|gb|U02920.1|SPU02920 [455063]
gi|784896|gb|L36923.1|STRSTRH [784896]
gi|3320386|gb|AF030373.1|AF030373 [3320386]
gi|2804772|gb|AF030374.1|AF030374 [2804772]
gi|2804762|gb|AF030372.1|AF030372 [2804762]
gi|2804756|gb|AF030371.1|AF030371 [2804756]
gi|2804750|gb|AF030370.1|AF030370 [2804750]
gi|2804745|gb|AF030369.1|AF030369 [2804745]
gi|2804739|gb|AF030368.1|AF030368 [2804739]
gi|2804732|gb|AF030367.1|AF030367 [2804732]
gi|2804726|gb|AF030366.1|AF030366 [2804726]
gi|2804720|gb|AF030365.1|AF030365 [2804720]
gi|2804713|gb|AF030364.1|AF030364 [2804713]
gi|2804707|gb|AF030363.1|AF030363 [2804707]
gi|2804701|gb|AF030362.1|AF030362 [2804701]
gi|2804694|gb|AF030361.1|AF030361 [2804694]
gi|2804688|gb|AF030360.1|AF030360 [2804688]
gi|2804682|gb|AF030359.1|AF030359 [2804682]
gi|3550979|dbj|AB010387.1|AB010387 [3550979]
gi|2275100|emb|AJ000336.1|SPR6LDH [2275100]
gi|3551853|gb|AF076029.1|AF076029 [3551853]
gi|3551773|gb|U94770.1|SPU94770 [3551773]
gi|3550617|emb|AJ004869.1|SPAJ4869 [3550617]
gi|3513563|gb|AF055727.1|AF055727 [3513563]
gi|3513561|gb|AF055726.1|AF055726 [3513561]
gi|3513559|gb|AF055725.1|AF055725 [3513559]
gi|3513557|gb|AF055724.1|AF055724 [3513557]
gi|3513555|gb|AF055723.1|AF055723 [3513555]
gi|3513553|gb|AF055722.1|AF055722 [3513553]
gi|3513549|gb|AF055721.1|AF055721 [3513549]
gi|3513545|gb|AF055720.1|AF055720 [3513545]
gi|1914869|emb|Z82001.1|SPZ82001 [1914869]
gi|2911421|gb|AF046238.1|AF046238 [2911421]
gi|2911419|gb|AF046237.1|AF046237 [2911419]
gi|2911417|gb|AF046236.1|AF046236 [2911417]
gi|2911415|gb|AF046235.1|AF046235 [2911415]
gi|2911413|gb|AF046234.1|AF046234 [2911413]
gi|2911411|gb|AF046233.1|AF046233 [2911411]
gi|2911409|gb|AF046232.1|AF046232 [2911409]
gi|2911407|gb|AF046231.1|AF046231 [2911407]
gi|2911405|gb|AF046230.1|AF046230 [2911405]
gi|3258601|gb|U40786.1|SPU40786 [3258601]
gi|3211756|gb|AF052209.1|AF052209 [3211756]
gi|3211752|gb|AF052208.1|AF052208 [3211752]
gi|3211747|gb|AF052207.1|AF052207 [3211747]
gi|3220194|gb|AF053121.1|AF053121 [3220194]
gi|2766052|emb|Z99863.1|SPZ99863 [2766052]
gi|2766050|emb|Z99862.1|SPZ99862 [2766050]
gi|2766048|emb|Z99861.1|SPZ99861 [2766048]
gi|2766046|emb|Z99860.1|SPZ99860 [2766046]
gi|2766044|emb|Z99859.1|SPZ99859 [2766044]
gi|2766042|emb|Z99858.1|SPZ99858 [2766042]
gi|2766040|emb|Z99857.1|SPZ99857 [2766040]
gi|2766038|emb|Z99856.1|SPZ99856 [2766038]
gi|2766036|emb|Z99855.1|SPZ99855 [2766036]
gi|2766034|emb|Z99854.1|SPZ99854 [2766034]
gi|2766032|emb|Z99853.1|SPZ99853 [2766032]
gi|2766030|emb|Z99852.1|SPZ99852 [2766030]
gi|2766028|emb|Z99851.1|SPZ99851 [2766028]
gi|2766026|emb|Z99850.1|SPZ99850 [2766026]
gi|2766024|emb|Z99849.1|SPZ99849 [2766024]
gi|2766022|emb|Z99848.1|SPZ99848 [2766022]
gi|2766020|emb|Z99847.1|SPZ99847 [2766020]
gi|2766018|emb|Z99846.1|SPZ99846 [2766018]
gi|2766016|emb|Z99845.1|SPZ99845 [2766016]
gi|2766014|emb|Z99844.1|SPZ99844 [2766014]
gi|2766012|emb|Z99843.1|SPZ99843 [2766012]
gi|2766010|emb|Z99842.1|SPZ99842 [2766010]
gi|2766008|emb|Z99841.1|SPZ99841 [2766008]
gi|2766006|emb|Z99840.1|SPZ99840 [2766006]
gi|2766004|emb|Z99839.1|SPZ99839 [2766004]
gi|2766002|emb|Z99838.1|SPZ99838 [2766002]
gi|2766000|emb|Z99837.1|SPZ99837 [2766000]
gi|2765998|emb|Z99828.1|SPZ99828 [2765998]
gi|2765996|emb|Z99827.1|SPZ99827 [2765996]
gi|2765994|emb|Z99826.1|SPZ99826 [2765994]



gi|2765992|emb|Z99825.1|SPZ99825 [2765992]

gi|2765990|emb|Z99824.1|SPZ99824 [2765990]

gi|2765988|emb|Z99823.1|SPZ99823 [2765988]

gi|2765986|emb|Z99822.1|SPZ99822 [2765986]

gi|2765984|emb|Z99821.1|SPZ99821 [2765984]

gi|2765982|emb|Z99820.1|SPZ99820 [2765982]

gi|2765980|emb|Z99819.1|SPZ99819 [2765980]

gi|2765978|emb|Z99818.1|SPZ99818 [2765978]

gi|2765976|emb|Z99817.1|SPZ99817 [2765976]

gi|2765974|emb|Z99816.1|SPZ99816 [2765974]

gi|2765972|emb|Z99815.1|SPZ99815 [2765972]

gi|2765970|emb|Z99814.1|SPZ99814 [2765970]

gi|2765968|emb|Z99813.1|SPZ99813 [2765968]

gi|2765966|emb|Z99812.1|SPZ99812 [2765966]

gi|2765964|emb|Z99811.1|SPZ99811 [2765964]

gi|2765962|emb|Z99810.1|SPZ99810 [2765962]

gi|2765960|emb|Z99809.1|SPZ99809 [2765960]

gi|2765958|emb|Z99808.1|SPZ99808 [2765958]

gi|2765956|emb|Z99807.1|SPZ99807 [2765956]

gi|2765954|emb|Z99806.1|SPZ99806 [2765954]

gi|2765952|emb|Z99805.1|SPZ99805 [2765952]

gi|2765950|emb|Z99804.1|SPZ99804 [2765950]

gi|2765948|emb|Z99803.1|SPZ99803 [2765948]

gi|2894104|emb|X77249.1|SPR6CIARH [2894104]

gi|3153897|gb|AF067128.1|AF067128 [3153897]

gi|3152712|gb|AF065153.1|AF065153 [3152712]

gi|3152710|gb|AF065152.1|AF065152 [3152710]

gi|3152708|gb|AF065151.1|AF065151 [3152708]

gi|3116426|gb|U84387.1|SPU84387 [3116426]

gi|2385403|emb|AJ001247.1|SP7465RR3
[2385403]

gi|2342540|emb|AJ001250.1|SP7978RR5
[2342540]

gi|2342539|emb|AJ001251.1|SP7978RR3
[2342539]

gi|2342538|emb|AJ001248.1|SP7466RR5
[2342538]

gi|2342537|emb|AJ001249.1|SP7466RR3
[2342537]

gi|3065896|gb|AF058920.1|AF058920 [3065896]
gi|2982647|emb|AJ002294.1|SPAJ2294 [2982647]






gi|2982645|emb|AJ002293.1|SPAJ2293 [2982645]
gi|2982643|emb|AJ002292.1|SPAJ2292 [2982643]
gi|2982641|emb|AJ002291.1|SPAJ2291 [2982641]
gi|1620466|emb|X99400.1|SPDACAO [1620466]
gi|2196665|emb|Z84381.1|HSZ84381 [2196665]
gi|2196663|emb|Z84380.1|HSZ84380 [2196663]
gi|2196661|emb|Z84379.1|HSZ84379 [2196661]
gi|2196659|emb|Z84378.1|HSZ84378 [2196659]
gi|625175|gb|L36131.1|STREXP10A [625175]
gi|3004945|gb|AF036624.1|AF036624 [3004945]
gi|3004943|gb|AF036623.1|AF036623 [3004943]
gi|3004941|gb|AF036622.1|AF036622 [3004941]
gi|3004939|gb|AF036621.1|AF036621 [3004939]
gi|3004937|gb|AF036620.1|AF036620 [3004937]
gi|3004935|gb|AF036619.1|AF036619 [3004935]
gi|2370572|emb|Z86112.1|SPZ86112 [2370572]
gi|2765946|emb|Z99802.1|SPZ99802 [2765946]
gi|2398824|emb|Z34303.1|SPCTNREC [2398824]
gi|2894512|emb|AJ223491.1|SPPPR3 [2894512]
gi|2198539|emb|X85787.1|SPCPS14E [2198539]
gi|2766156|emb|Z99915.1|SPZ99915 [2766156]
gi|2766154|emb|Z99914.1|SPZ99914 [2766154]
gi|2766152|emb|Z99913.1|SPZ99913 [2766152]
gi|2766150|emb|Z99912.1|SPZ99912 [2766150]
gi|2766148|emb|Z99911.1|SPZ99911 [2766148]
gi|2766146|emb|Z99910.1|SPZ99910 [2766146]
gi|2766144|emb|Z99909.1|SPZ99909 [2766144]
gi|2766142|emb|Z99908.1|SPZ99908 [2766142]
gi|2766140|emb|Z99907.1|SPZ99907 [2766140]
gi|2766138|emb|Z99906.1|SPZ99906 [2766138]
gi|2766136|emb|Z99905.1|SPZ99905 [2766136]
gi|2766134|emb|Z99904.1|SPZ99904 [2766134]
gi|2766132|emb|Z99903.1|SPZ99903 [2766132]
gi|2766130|emb|Z99902.1|SPZ99902 [2766130]
gi|2766128|emb|Z99901.1|SPZ99901 [2766128]
gi|2766126|emb|Z99900.1|SPZ99900 [2766126]
gi|2766124|emb|Z99899.1|SPZ99899 [2766124]
gi|2766122|emb|Z99898.1|SPZ99898 [2766122]
gi|2766120|emb|Z99897.1|SPZ99897 [2766120]
gi|2766118|emb|Z99896.1|SPZ99896 [2766118]



gi|2766116|emb|Z99895.1|SPZ99895 [2766116]
gi|2766114|emb|Z99894.1|SPZ99894 [2766114]
gi|2766112|emb|Z99893.1|SPZ99893 [2766112]
gi|2766110|emb|Z99892.1|SPZ99892 [2766110]
gi|2766108|emb|Z99891.1|SPZ99891 [2766108]
gi|2766106|emb|Z99890.1|SPZ99890 [2766106]
gi|2766104|emb|Z99889.1|SPZ99889 [2766104]
gi|2766102|emb|Z99888.1|SPZ99888 [2766102]
gi|2766100|emb|Z99887.1|SPZ99887 [2766100]
gi|2766098|emb|Z99886.1|SPZ99886 [2766098]
gi|2766096|emb|Z99885.1|SPZ99885 [2766096]
gi|2766094|emb|Z99884.1|SPZ99884 [2766094]
gi|2766092|emb|Z99883.1|SPZ99883 [2766092]
gi|2766090|emb|Z99882.1|SPZ99882 [2766090]
gi|2766088|emb|Z99881.1|SPZ99881 [2766088]
gi|2766086|emb|Z99880.1|SPZ99880 [2766086]
gi|2766084|emb|Z99879.1|SPZ99879 [2766084]
gi|2766082|emb|Z99878.1|SPZ99878 [2766082]
gi|2766080|emb|Z99877.1|SPZ99877 [2766080]
gi|2766078|emb|Z99876.1|SPZ99876 [2766078]
gi|2766076|emb|Z99875.1|SPZ99875 [2766076]
gi|2766074|emb|Z99874.1|SPZ99874 [2766074]
gi|2766072|emb|Z99873.1|SPZ99873 [2766072]
gi|2766070|emb|Z99872.1|SPZ99872 [2766070]
gi|2766068|emb|Z99871.1|SPZ99871 [2766068]
gi|2766066|emb|Z99870.1|SPZ99870 [2766066]
gi|2766064|emb|Z99869.1|SPZ99869 [2766064]
gi|2766062|emb|Z99868.1|SPZ99868 [2766062]
gi|2766060|emb|Z99867.1|SPZ99867 [2766060]
gi|2766058|emb|Z99866.1|SPZ99866 [2766058]
gi|2766056|emb|Z99865.1|SPZ99865 [2766056]
gi|2766054|emb|Z99864.1|SPZ99864 [2766054]
gi|2765906|emb|Z99206.1|SPZ99206 [2765906]
gi|2765904|emb|Z99205.1|SPZ99205 [2765904]
gi|2765902|emb|Z99204.1|SPZ99204 [2765902]
gi|2765900|emb|Z99203.1|SPZ99203 [2765900]
gi|2765898|emb|Z99202.1|SPZ99202 [2765898]
gi|2765896|emb|Z99201.1|SPZ99201 [2765896]
gi|2765894|emb|Z99200.1|SPZ99200 [2765894]
gi|2708631|gb|AF036951.1|AF036951 [2708631]

gi|886956|emb|Z49097.1|SPCS1112X [886956]

gi|2656093|gb|L21856.1|STRMALR [2656093]

gi|2576332|emb|AJ002055.1|SPSPSA47 [2576332]

gi|2576330|emb|AJ002054.1|SPSPSA2 [2576330]

gi|2511704|emb|Y10818.1|SPY10818 [2511704]

gi|1944619|emb|Z83335.1|SPZ83335 [1944619]

gi|2425108|gb|AF019904.1|AF019904 [2425108]

gi|2385404|emb|AJ001246.1|SP7465RR5
[2385404]

gi|438213|emb|Z16082.1|PNALIB [438213]

gi|2149613|gb|U90721.1|SPU90721 [2149613]

gi|49391|emb|Z21841.1|SPPBP2BB [49391]

gi|2209207|gb|AF004325.1|AF004325 [2209207]

gi|2293061|emb|Z95914.1|SPZ95914 [2293061]

gi|2276393|gb|U16156.1|SPU16156 [2276393]

gi|2183314|gb|AF003930.1|AF003930 [2183314]

gi|2182093|emb|X95717.1|SPPARECGN
[2182093]

gi|984230|emb|Z49095.1|SPCS1111A [984230]

gi|886954|emb|Z49096.1|SPCS1092X [886954]

gi|1181613|dbj|D82873.1|STRPBP2BE [1181613]

gi|1181612|dbj|D82871.1|STRPBP2BCZ
[1181612]

gi|1181611|dbj|D82870.1|STRPBP2BB2 [1181611]

gi|1181579|dbj|D82869.1|STRPBP2BA1 [1181579]

gi|1181192|dbj|D82872.1|STRPBP2BD [1181192]

gi|575595|dbj|D42075.1|STRPBP2B2 [575595]

gi|1339971|dbj|D42074.1|STRPBP2B1 [1339971]

gi|2108329|emb|Y11463.1|SPDNAGCPO
[2108329]

gi|1944115|dbj|AB002522.1|AB002522 [1944115]
gi|1666669|emb|Z77727.1|SPIS1381C [1666669]
gi|1666668|emb|Z77726.1|SPIS1381B [1666668]
gi|1666667|emb|Z77725.1|SPIS1381A [1666667]
gi|1914873|emb|Z82002.1|SPZ82002 [1914873]
gi|1431584|emb|Z74778.1|SPDHFR [1431584]
gi|47452|emb|Z15120.1|SPSTRG [47452]
gi|581717|emb|Z12159.1|SPCP131G [581717]
gi|47342|emb|X17337.1|SPAMILOC [47342]
gi|1800300|gb|U83667.1|SPU83667 [1800300]
gi|1532066|emb|Y07780.1|SPTETOGEN [1532066]



gi|1161269|gb|L39074.1|STRSPXB [1161269]

gi|1460093|emb|X94909.1|SPIGA1PRT [1460093]

gi|1750263|gb|U72720.1|SPU72720 [1750263]

gi|298649|gb|S56948.1|S56948 [298649]

gi|254537|gb|S43511.1|S43511 [254537]

gi|245227|gb|S81051.1|S81051 [245227]

gi|245226|gb|S81045.1|S81045 [245226]

gi|245225|gb|S81043.1|S81043 [245225]

gi|1150618|emb|Z49988.1|SPMMSAGEN
[1150618]

gi|47456|emb|X01138.1|SPTN917A [47456]

gi|1658316|emb|Z47210.1|SPDEXCAP [1658316]

gi|1550802|emb|X95385.1|SPCOMCGEN
[1550802]

gi|47457|emb|X01137.1|SPTN917B [47457]

gi|975714|emb|X90941.1|SPTRJ5251 [975714]

gi|975713|emb|X90940.1|SPTCJ5251 [975713]

gi|975709|emb|X90939.1|SPDNATETM [975709]

gi|1524346|emb|Z79691.1|SOORFS [1524346]

gi|1553054|emb|X98364.1|SPPBPHU9 [1553054]

gi|1553052|emb|X98367.1|SPPBPHU13 [1553052]

gi|1553050|emb|X98366.1|SPPBPHU12 [1553050]

gi|1553048|emb|X98365.1|SPPBPHU11 [1553048]

gi|1575029|gb|U53509.1|SPU53509 [1575029]

gi|1542968|gb|U49088.1|SPU49088 [1542968]

gi|1542966|gb|U49087.1|SPU49087 [1542966]

gi|1536961|emb|Y07845.1|SPGYRA [1536961]

gi|47391|emb|X16367.1|SPPBPX [47391]

gi|1490398|emb|Z67739.1|SPPARCETP [1490398]

gi|1490395|emb|Z67740.1|SPGYRBORF
[1490395]

gi|1431589|emb|Z74777.1|SPTMRDHFR
[1431589]

gi|408145|emb|Z21702.1|SPUNGMUTX [408145]
gi|47461|emb|X61025.1|SPXISINT [47461]
gi|47459|emb|X55651.1|SPUNGG [47459]
gi|47454|emb|X52632.1|SPT1545E [47454]
gi|47421|emb|Z17307.1|SPRECA [47421]
gi|47419|emb|X67873.1|SPPONA8 [47419]
gi|47417|emb|X67872.1|SPPONA7 [47417]
gi|47415|emb|X67871.1|SPPONA6 [47415]
gi|47413|emb|X67870.1|SPPONA5 [47413]
gi|47411|emb|X67869.1|SPPONA4 [47411]
gi|47409|emb|X67867.1|SPPONA2 [47409]
gi|47407|emb|X67866.1|SPPONA1 [47407]
gi|47405|emb|X67868.1|SPPNA3 [47405]
gi|47403|emb|X52474.1|SPPLY [47403]
gi|984232|emb|X16022.1|SPPENA [984232]
gi|517190|emb|X78215.1|SPPBPXG [517190]
gi|295840|emb|Z22230.1|SPPBP2BBA [295840]
gi|288981|emb|Z22185.1|SPPBP2BAC [288981]
gi|288979|emb|Z22184.1|SPPBP2BAB [288979]
gi|288466|emb|Z21981.1|SPPBP2BAA [288466]
gi|49390|emb|Z21813.1|SPPBP2XD [49390]
gi|49389|emb|Z21812.1|SPPBP2XC [49389]
gi|49387|emb|Z21811.1|SPPBP2BJ [49387]
gi|49385|emb|Z21810.1|SPPBP2BI [49385]
gi|49382|emb|Z21808.1|SPPBP2BH [49382]
gi|49380|emb|Z21807.1|SPPBP2BG [49380]
gi|49379|emb|Z21806.1|SPPBP2BF [49379]
gi|49377|emb|Z21805.1|SPPBP2BE [49377]
gi|49376|emb|Z21804.1|SPPBP2XB [49376]
gi|49375|emb|Z21803.1|SPPBP2XA [49375]
gi|49374|emb|Z21802.1|SPPBP2BD [49374]
gi|49372|emb|Z21801.1|SPPBP2BC [49372]
gi|49369|emb|Z21799.1|SPPBP2BA [49369]
gi|47399|emb|X13137.1|SPPENASE [47399]
gi|47397|emb|X13136.1|SPPENARE [47397]
gi|1052802|emb|X83917.1|SPGYRBG [1052802]
gi|587550|emb|X72967.1|SPNANA [587550]
gi|49384|emb|Z21809.1|SPPBP1AB [49384]
gi|49371|emb|Z21800.1|SPPBP1AA [49371]
gi|984228|emb|Z49094.1|SPCS1091A [984228]
gi|47372|emb|X54225.1|SPENDA [47372]
gi|806590|emb|Z49246.1|SP667SOD [806590]
gi|407172|emb|Z26851.1|SPATPAS2 [407172]
gi|407166|emb|Z26850.1|SPATPAS1 [407166]
gi|47353|emb|X63602.1|SPBOX [47353]
gi|47348|emb|X05577.1|SPAPHA3 [47348]
gi|47337|emb|X65132.1|SP824PBPX [47337]
gi|47335|emb|X65134.1|SP669PBPX [47335]



gi|47331|emb|X65133.1|SP577PBPX [47331]
gi|559527|emb|X65136.1|SP110PBPX [559527]
gi|311415|emb|Z22807.1|SP16SRNAA [311415]
gi|47329|emb|X65135.1|SP531PBPX [47329]
gi|47307|emb|X65131.1|SP290PBPX [47307]
gi|47295|emb|X58312.1|SP16SRNA [47295]
gi|854614|emb|Z49109.1|SPGADAGN [854614]
gi|556428|gb|L36660.1|STRORFI [556428]
gi|511062|emb|Z35135.1|SPALIAG [511062]
gi|1208737|gb|U47625.1|SPU47625 [1208737]
gi|530062|gb|U12567.1|SPU12567 [530062]
gi|153656|gb|M29686.1|STRHEXB [153656]
gi|153654|gb|M18729.1|STRHEXA [153654]
gi|153608|gb|M14339.1|STRDPN2A [153608]
gi|153605|gb|M14340.1|STRDPN1A [153605]
gi|643543|gb|U20084.1|SPU20084 [643543]
gi|643541|gb|U20083.1|SPU20083 [643541]
gi|643539|gb|U20082.1|SPU20082 [643539]
gi|643537|gb|U20081.1|SPU20081 [643537]
gi|643535|gb|U20080.1|SPU20080 [643535]
gi|643533|gb|U20079.1|SPU20079 [643533]
gi|643531|gb|U20078.1|SPU20078 [643531]
gi|643529|gb|U20077.1|SPU20077 [643529]
gi|643527|gb|U20076.1|SPU20076 [643527]
gi|643525|gb|U20075.1|SPU20075 [643525]
gi|643523|gb|U20074.1|SPU20074 [643523]
gi|643521|gb|U20073.1|SPU20073 [643521]
gi|643519|gb|U20072.1|SPU20072 [643519]
gi|643517|gb|U20071.1|SPU20071 [643517]
gi|643515|gb|U20070.1|SPU20070 [643515]
gi|643513|gb|U20069.1|SPU20069 [643513]
gi|643511|gb|U20068.1|SPU20068 [643511]
gi|643509|gb|U20067.1|SPU20067 [643509]
gi|1017802|gb|U37560.1|SPU37560 [1017802]
gi|663277|gb|M36180.1|STRCOMAA [663277]
gi|437704|gb|U0670.1|STRHYALURO [437704]
gi|153849|gb|L07751.1|STRTN5253R [153849]
gi|153855|gb|M25519.1|STRVA1 [153855]
gi|153853|gb|M80215.1|STRUVS402A [153853]
gi|153848|gb|L07750.1|STRTN5252L [153848]
gi|153840|gb|M74122.1|STRSURPROA [153840]
gi|153796|gb|M60763.1|STRRRNAA [153796]
gi|153791|gb|M31296.1|STRRECP [153791]
gi|516639|gb|L20556.1|STRPLPA [516639]
gi|153783|gb|M28679.1|STRPROMB [153783]
gi|153782|gb|M28678.1|STRPROMA [153782]
gi|153766|gb|M90527.1|STRPONA [153766]
gi|153764|gb|J04479.1|STRPOLA [153764]
gi|153752|gb|M25515.1|STRNG4369 [153752]
gi|153722|gb|L08611.1|STRMLTODX [153722]
gi|153702|gb|J01796.1|STRMALMXP [153702]
gi|153701|gb|J01795.1|STRMALMX [153701]
gi|153693|gb|M13812.1|STRLYTPN [153693]
gi|153691|gb|M17717.1|STRLYS [153691]
gi|153667|gb|M25525.1|STRKAG73 [153667]
gi|398102|gb|L20564.1|STREXP9B [398102]
gi|398100|gb|L20563.1|STREXP9A [398100]
gi|398098|gb|L20562.1|STREXP8A [398098]
gi|398096|gb|L20561.1|STREXP7A [398096]
gi|398094|gb|L20560.1|STREXP6A [398094]
gi|398092|gb|L20559.1|STREXP5A [398092]
gi|398090|gb|L20558.1|STREXP4A [398090]
gi|153626|gb|J04234.1|STREXOA [153626]
gi|153612|gb|M11226.1|STRDPNM [153612]
gi|153603|gb|M25521.1|STRDN87669 [153603]
gi|153601|gb|M25526.1|STRDN87577 [153601]
gi|153599|gb|M25522.1|STRDN179 [153599]
gi|153594|gb|M37688.1|STRDACA [153594]
gi|153582|gb|L07752.1|STRATTB [153582]
gi|466514|gb|U1413.1|STRlRRA [466514]
gi|153551|gb|M25520.1|STR8249 [153551]
gi|153549|gb|M25524.1|STR5313972 [153549]
gi|153547|gb|M25517.1|STR29044 [153547]
gi|153545|gb|M25523.1|STR181071 [153545]
gi|153541|gb|M25518.1|STR121 [153541]
gi|153539|gb|M25516.1|STR110K70 [153539]
gi|506632|gb|U04047.1|SPU04047 [506632]
gi|393267|gb|L19055.1|STRPAPA [393267]
gi|442066|gb|S62272.1|S62272 [442066]
gi|295191|gb|L15190.1|STRPURISYN [295191]
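
The entries in the foregoing list follow a pipe-delimited identifier layout (gi number, source database tag, accession.version, locus name), with the gi number repeated in square brackets. As an illustrative aid only, and not part of the disclosed methods, the following minimal Python sketch shows one way such an entry could be split into fields and checked for internal consistency. The names SeqId and parse_entry are hypothetical, and the accepted database tags (gb, emb, dbj) are assumed from the entries shown above.

```python
import re
from typing import NamedTuple, Optional


class SeqId(NamedTuple):
    """Fields of one listed entry (hypothetical container for this sketch)."""
    gi: int
    db: str
    accession: str
    version: int
    locus: str


# gi|<number>|<db>|<accession>.<version>|<locus>, optionally followed by [<number>]
_ENTRY = re.compile(
    r"^gi\|(\d+)\|(gb|emb|dbj)\|([A-Z]+\d+)\.(\d+)\|(\S+)(?:\s+\[(\d+)\])?$"
)


def parse_entry(line: str) -> Optional[SeqId]:
    """Return the parsed fields, or None if the line is malformed or the
    bracketed number disagrees with the gi number."""
    match = _ENTRY.match(line.strip())
    if match is None:
        return None
    gi, db, accession, version, locus, bracketed = match.groups()
    if bracketed is not None and bracketed != gi:
        return None
    return SeqId(int(gi), db, accession, int(version), locus)
```

A line that fails to parse, for example one containing stray spaces or a substituted delimiter, returns None and can be flagged for manual checking against the bracketed number.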
CLAIMS

What is claimed is:

1. A method for identifying a bacteriophage coding region encoding a
product active on an essential bacterial target, comprising identifying a nucleic acid
sequence encoding a gene product which provides a bacteria-inhibiting function when
said bacteriophage infects a host bacterium,

wherein said bacteriophage is uncharacterized and said host bacterium
is a pathogenic bacterium.

2. The method of claim 1, further comprising expressing a recombinant
bacteriophage ORF in cells of a bacterial strain, wherein inhibition of said cells
following expression of said ORF is indicative that said product is active on an
essential bacterial target.

3. The method of claim 2, wherein inhibition of said bacterium following
expression of said ORF is determined by comparison with the growth or viability of
said bacterium following expression of an inactivated mutant form of said ORF or in
the absence of expression of said ORF, and wherein inhibition of said bacterium
following expression of said ORF is indicative that said product is active on an
essential bacterial target.

4. The method of claim 2, wherein expression of said ORF is inducible.

5. The method of claim 1, further comprising sequencing at least a
portion of a bacteriophage genome.

6. The method of claim 1, wherein at least a portion of the nucleotide
sequence of a bacteriophage genome is known, said method further comprising
identifying at least one ORF in said portion by computer analysis of said sequence.

7. The method of claim 6, further comprising analyzing the sequence of
said at least one ORF or of a polypeptide encoded by said ORF to identify
homologous genes or gene products of known biochemical function, thereby
indicating the biochemical function of said polypeptide.

8. The method of claim 7, wherein said homologous gene or gene product
is a bacterial gene important for cell viability.

9. The method of claim 7, wherein said homologous gene or gene product
is a gene or gene product known to have a bacteria-inhibiting function.

10. The method of claim 6, further comprising analyzing the sequence of
said at least one ORF or of a polypeptide encoded by said ORF to identify structural
motifs in said polypeptide, thereby indicating the cellular function of said polypeptide.

11. The method of claim 1, wherein a host bacterium for said
bacteriophage is selected from the species group consisting of bacteria listed in Table
1.

12. The method of claim 1, wherein said bacteriophage is selected from the
group consisting of uncharacterized bacteriophage listed in Table 1.

13. The method of claim 2, wherein a plurality of bacteriophage ORFs are
expressed in at least one bacterium.

14. The method of claim 13, wherein each of said plurality of
bacteriophage ORFs is expressed in a different bacterium.

15. The method of claim 14, wherein said plurality of bacteriophage ORFs
comprises at least 10% of the ORFs in the genome of said bacteriophage.

16. The method of claim 1, wherein said pathogenic bacterium is an animal
pathogen.

17. The method of claim 16, wherein said pathogenic bacterium is a human
pathogen.

18. The method of claim 1, wherein said pathogenic bacterium is a plant
pathogen.

19. The method of claim 1, further comprising confirming the inhibitor
function of said ORF.

20. The method of claim 19, wherein said confirming comprises
expressing a loss-of-function mutant form of said ORF in said host bacterium.

21. The method of claim 1, wherein said identifying a nucleic acid
sequence encoding a gene product active on an essential bacterial target comprises
identifying a nucleic acid sequence encoding a homolog of a bacteriophage
polypeptide known to be active on an essential bacterial target.

22. The method of claim 1, wherein said identifying a bacteriophage
coding region comprises identifying a first coding region from a bacteriophage having
a non-pathogenic host bacterial strain related to said pathogenic bacterium, said first
coding region encoding a product active on an essential bacterial target; and
identifying a homolog of said first coding region, wherein said
homolog is a probable said bacteriophage coding region encoding a product active on
an essential bacterial target.

23. The method of claim 2, wherein a plurality of bacteriophage ORFs
from a plurality of different bacteriophage are expressed in at least one bacterium.

24. The method of claim 23, wherein each of said plurality of
bacteriophage ORFs are expressed in different bacteria.

25. A method for identifying a target for antibacterial agents, comprising
determining the bacterial target of an uncharacterized bacteriophage inhibitor protein.

26. The method of claim 25, wherein said determining comprises
identifying at least one bacterial protein which binds to said bacteriophage inhibitor
protein or a fragment thereof.

27. The method of claim 26, wherein said binding is determined using
affinity chromatography on a solid matrix.



29. The method of claim 28, wherein said genetic screen is a yeast two-hybrid
screen.

30. The method of claim 25, wherein said determining comprises a
co-immunoprecipitation assay or a protein-protein crosslinking assay.

31. The method of claim 25, wherein said determining comprises
identifying a mutated bacterial coding sequence which protects a bacterium from said
bacteriophage inhibitor.

32. The method of claim 25, wherein said determining comprises
identifying a bacterial coding sequence which protects a bacterium against said
bacteriophage inhibitor when expressed at high levels in said bacterium.

33. The method of claim 25, wherein said determining further comprises
identifying a bacterial nucleic acid sequence encoding a polypeptide target of said
bacteriophage inhibitor protein.

34. The method of claim 33, wherein said nucleic acid sequence is
identified by determining at least a portion of the amino acid sequence of a bacterial
protein target, and identifying a bacterial nucleic acid sequence which encodes said
protein target.

35. The method of claim 25, wherein said bacterial target is naturally
produced by a bacterial species selected from the group consisting of species of the
genera listed in Table 1.

36. The method of claim 25, wherein said bacterial target is naturally
produced by a bacterial strain selected from the group consisting of species listed in
Table 1.

37. The method of claim 25, wherein said inhibitor protein is naturally
produced by a bacteriophage selected from the group consisting of uncharacterized
bacteriophage listed in Table 1.

38. The method of claim 25, further comprising identifying a
bacteriophage ORF which encodes a product having a bacteria-inhibiting function.

39. The method of claim 38, wherein said identifying a phage ORF
comprises expressing at least one bacteriophage ORF in a bacterium, wherein
inhibition of said bacterium following said expression is indicative that said ORF
encodes a bacteria-inhibiting function.

40. The method of claim 39, wherein a plurality of bacteriophage ORFs are
expressed in at least one bacterium.

41. The method of claim 40, wherein each of said plurality of
bacteriophage ORFs is expressed in a different bacterium.

42. The method of claim 41, wherein said plurality of bacteriophage ORFs
comprises at least 10% of the ORFs in the genome of said bacteriophage.

43. The method of claim 25, wherein said determining the bacterial target
of a bacteriophage inhibitor protein is performed for a plurality of different
bacteriophage of the same host bacterium.

44. The method of claim 25, wherein said bacterial target originates from
an animal pathogen.

45. The method of claim 44, wherein said bacterial target is a gene
homologous to a gene from an animal pathogen.

46. The method of claim 44, wherein said pathogen is a human pathogen.

47. The method of claim 25, wherein said bacterial target originates from a
plant pathogen.

48. The method of claim 25, wherein said bacterial target is a gene
homologous to a gene from a plant pathogen.

49. The method of claim 25, further comprising determining the cellular or
biochemical function or both of said inhibitor protein.

50. The method of claim 25, wherein said identifying the bacterial target
comprises identifying a phage-specific site of action.

51. An isolated, purified, or enriched nucleic acid sequence at least 15
nucleotides in length, wherein said sequence corresponds to at least a portion of a
bacteriophage sequence, and wherein said bacteriophage is selected from the group
consisting of Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD,
Enterococcus bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1.

52. The nucleic acid sequence of claim 51, wherein said sequence
comprises at least 50 nucleotides.

53. The nucleic acid sequence of claim 51, wherein said nucleic acid
sequence corresponds to at least a portion of a nucleic acid sequence which encodes a
product which provides a bacteria-inhibiting function.

54. The nucleic acid sequence of claim 53, wherein said nucleic acid 
sequence encodes a polypeptide which provides a bacteria-inhibiting function. 

55. The nucleic acid sequence of claim 54, wherein said nucleic acid 
sequence is transcriptionally linked with regulatory sequences enabling induction of 
expression of said sequence. 

56. An isolated, purified, or enriched polypeptide comprising at least a
portion of a protein providing a bacteria-inhibiting function, wherein said polypeptide
is normally encoded by a bacteriophage selected from the group consisting of
Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD, Enterococcus
bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1.

57. The polypeptide of claim 56, wherein said polypeptide provides said 
bacteria-inhibiting function. 

58. The polypeptide of claim 56, wherein said polypeptide comprises a
portion at least 10 amino acid residues in length of a said polypeptide normally
encoded by said bacteriophage.

59. A recombinant vector comprising a bacteriophage ORF corresponding
to an ORF from a bacteriophage having a pathogenic bacterial host, wherein said
bacterial host is selected from the group consisting of uncharacterized bacteria of
Table 1.

60. The vector of claim 59, wherein said vector is an expression vector. 

61. The vector of claim 59, wherein said bacteriophage is selected from the
group consisting of uncharacterized bacteriophage of Table 1.

62. The vector of claim 61, wherein said bacteriophage is selected from the
group consisting of Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD,
Enterococcus bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1.

63. The vector of claim 60, wherein expression of said ORF is inducible.



64. A recombinant cell comprising a vector, wherein said vector comprises
an ORF from a bacteriophage having a pathogenic bacterial host, wherein said
bacterial host is selected from the group consisting of bacterial species of Table 1.

65. The recombinant cell of claim 64, wherein said bacteriophage is
selected from the group consisting of uncharacterized phage of Table 1.

66. The cell of claim 65, wherein said bacteriophage is selected from the
group consisting of Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD,
Enterococcus bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1.

67. The cell of claim 64, wherein said vector is an expression vector and
expression of said ORF is inducible.

68. A method for identifying an antibacterial agent, comprising identifying
an active portion of a product of a bacteria-inhibiting ORF of a bacteriophage.

69. The method of claim 68, further comprising constructing a synthetic
peptidomimetic molecule, wherein the structure of said molecule corresponds to the
structure of said active portion.



70. A method for identifying a compound active on a target of a 
bacteriophage inhibitor protein, comprising the step of 

contacting a bacterial target protein with a test compound; and 
determining whether said compound binds to or reduces the level of 
activity of said target protein,

wherein binding of said compound with said target protein or a 
reduction of the level of activity of said protein is indicative that said compound is 
active on said target and wherein said target is uncharacterized. 

71. The method of claim 70, wherein said contacting is carried out in vitro.

72. The method of claim 70, wherein said contacting is carried out in vivo
in a cell.

73. The method of claim 70, wherein said compound is a small molecule.

74. The method of claim 70, wherein said compound is a peptidomimetic
compound.

75. The method of claim 70, wherein said compound is a fragment of a
bacteriophage inhibitor protein.

76. The method of claim 70, further comprising determining the site of 
action of said compound on said target protein. 



77. The method of claim 70, wherein said contacting is performed for a
plurality of said target proteins.



78. A method of screening for potential antibacterial agents, comprising
the step of determining whether any of a plurality of compounds is active on a target
of a bacteriophage inhibitor protein,
wherein said target is naturally produced by a pathogenic bacterium.

79. The method of claim 78, wherein said plurality of compounds are
small molecules.

80. The method of claim 78, wherein said determining is performed for a
plurality of said targets.



81. A method for inhibiting a bacterium, comprising the step of:
contacting said bacterium with a compound active on a target of a
bacteriophage inhibitor protein, wherein said target or the target site is
uncharacterized.

82. The method of claim 81, wherein said compound is said protein or an
active fragment thereof.

83. The method of claim 81, wherein said compound is a structural
mimetic of said protein.

84. The method of claim 81, wherein said compound is a small molecule.

85. The method of claim 81, wherein said contacting is performed in vitro.

86. The method of claim 81, wherein said contacting is performed in vivo
in an animal.

87. The method of claim 86, wherein said animal is a human. 

88. The method of claim 81, wherein said contacting is carried out in vivo
in a plant.

89. The method of claim 81, wherein said bacterium is selected from the
group of bacteria listed in Table 1.

90. A method for treating a bacterial infection in an animal suffering from
an infection, comprising administering to said animal a therapeutically effective
amount of a compound active on a target of a bacteriophage inhibitor protein in a
bacterium involved in said infection,
wherein said target is an uncharacterized target or the compound is active at an
uncharacterized target site.

91. The method of claim 90, wherein said compound is a small molecule.

92. The method of claim 90, wherein said compound is a peptidomimetic
compound.

93. The method of claim 90, wherein said compound is a fragment of a 
bacteriophage inhibitor protein. 

94. The method of claim 90, wherein said animal is a mammal. 

95. The method of claim 94, wherein said mammal is a human. 

96. The method of claim 90, wherein said bacterium is selected from the
group listed in Table 1.

97. The method of claim 90, wherein said bacteriophage inhibitor protein
is from a bacteriophage selected from the group of bacteriophage listed in Table 1.

98. A method for prophylactically treating an animal at risk of an infection,
comprising administering to said animal a prophylactically effective amount of a
compound active on a target of a bacteriophage inhibitor protein,
wherein said target is an uncharacterized target or the site of action of
said compound is an uncharacterized target site.

99. The method of claim 98, wherein said compound is a small molecule. 

100. The method of claim 98, wherein said compound is a peptidomimetic
compound.

101. The method of claim 98, wherein said compound is a fragment of a
bacteriophage inhibitor protein.

102. The method of claim 98, wherein said animal is a mammal. 

103. The method of claim 102, wherein said mammal is a human.

104. An antibacterial agent active on a target of a bacteriophage inhibitor
protein, wherein said target is an uncharacterized target or said agent is active at a
phage-specific site on said target.

105. The agent of claim 104, wherein said agent is a peptidomimetic of a
bacteriophage inhibitor polypeptide.

106. The agent of claim 104, wherein said agent is a small molecule. 

107. The agent of claim 104, wherein said agent is a fragment of a
bacteriophage inhibitor polypeptide.

108. The agent of claim 104, wherein said agent is active at a phage-specific 
site on said target. 

109. A method of making an antibacterial agent, comprising the steps of:

a) identifying a target of a bacteriophage inhibitor polypeptide;

b) screening a plurality of test compounds to identify a compound
active on said target; and

c) synthesizing said compound in an amount sufficient to provide a
therapeutic effect when administered to an organism infected by a bacterium naturally
producing said target.

110. The method of claim 109, wherein said compound is a small molecule.

111. The method of claim 109, wherein said compound is a peptidomimetic
compound.

112. The method of claim 109, wherein said compound is a fragment or
derivative of a bacteriophage inhibitor protein.

113. A computer readable device having recorded therein a nucleotide
sequence of a portion of at least one bacteriophage genome of Staphylococcus aureus
bacteriophage 77, bacteriophage 3A, or bacteriophage 96, a nucleotide sequence at
least 95% identical to a said nucleotide sequence, a ribonucleic acid equivalent, a
degenerate equivalent, a homologous sequence, or at least one amino acid sequence
encoded by said nucleotide sequence; and
a nucleotide sequence or amino acid sequence analysis program,
wherein said program can perform at least one sequence analysis on said
nucleotide or amino acid sequence.

114. The device of claim 113, wherein said at least a portion of at least one
bacteriophage genome comprises at least one ORF.

115. The device of claim 113, wherein said device comprises a medium
selected from the group consisting of floppy disk, computer hard drive, optical disk,
computer random access memory, and magnetic tape, wherein said nucleotide or
amino acid sequence or said program or both are recorded on said medium.

116. The device of claim 113, wherein said portion of at least one
bacteriophage genomic nucleotide sequence comprises at least 50% of at least one
bacteriophage genomic sequence.

117. The device of claim 113, wherein said at least one bacteriophage 
nucleotide genomic sequence comprises portions of a plurality of bacteriophage 
nucleotide genomic sequences. 

118. A computer-based system for identifying biologically important 
portions of a bacteriophage genome, comprising: 

a) a data storage medium having recorded thereon a nucleotide sequence
corresponding to a portion of at least one bacteriophage genome, wherein said
bacteriophage genome is uncharacterized;

b) a set of instructions allowing searching of said sequence to analyze said
sequence; and

c) an output device.

119. The system of claim 118, wherein said output device
comprises a device selected from the group consisting of a printer, a video display,
and a recording medium.

120. The system of claim 118, wherein said bacteriophage genome is of a
bacteriophage selected from the group consisting of uncharacterized bacteriophage
listed in Table 1.

121. The system of claim 118, wherein said uncharacterized bacteriophage
is selected from the group consisting of bacteriophage 77, 3A, and 96.

122. A method for identifying or characterizing a bacteriophage ORF,
comprising the steps of:

a) providing a computer-based system for analyzing nucleic acid or
amino acid sequence data, wherein said system comprises a data storage medium
having recorded thereon at least one nucleotide or amino acid sequence corresponding
to a portion of at least one uncharacterized bacteriophage genome, a set of instructions
allowing searching of said sequence to analyze said sequence; and an output device;

b) analyzing at least a portion of at least one said sequence; and

c) outputting results of said analyzing to said output device.
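
By way of non-limiting illustration only, the analyzing and outputting steps b) and c) could be sketched as below. The sketch assumes that an ORF is a run of codons beginning with ATG and ending at the first in-frame stop codon (TAA, TAG, or TGA), and that all six reading frames are scanned; the function names `reverse_complement` and `find_orfs`, the `min_codons` threshold, and the example sequence are hypothetical and are not taken from the specification.

```python
from typing import List, Tuple

# Assumption: standard stop codons; alternative start codons are ignored.
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 33) -> List[Tuple[str, int, int, int]]:
    """Scan six reading frames for ATG...stop ORFs.

    Returns (strand, frame, start, end) tuples, where start and end are
    0-based, end-exclusive offsets on the scanned strand and the stop
    codon is included in the span. min_codons counts codons before the stop.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    # Hypothetical toy sequence, not a phage sequence from the specification.
    example = "CCATGAAATTTGGGTAACC"
    for strand, frame, start, end in find_orfs(example, min_codons=3):
        print(strand, frame, start, end)  # step c): output the results
```

In such a sketch the data storage medium of step a) is simply whatever holds the sequence string, and the output device is the console; a homology or similarity search against known bacterial ORFs would be a separate analysis not shown here.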

123. The method of claim 122, wherein said analysis identifies sequence
similarity or homology with sequences selected from the group consisting of bacterial
ORFs encoding products with related biological function; ORFs encoding known
inhibitors of bacteria; and essential bacterial ORFs.

124. The method of claim 122, wherein said analysis comprises identifying
a probable biological function based on identification of structural elements or
sequence homology or similarity.

125. The method of claim 122, wherein said bacteriophage is selected from
the group consisting of uncharacterized bacteriophage listed in Table 1.

126. The method of claim 125, wherein said uncharacterized bacteriophage
is selected from bacteriophage 77, 3A, and 96.

FIG. 1B.




PCR of pT0021 with XhoF & BamHNR

XhoF 5'-AATT CTCGAGT AAAATAACAT-3'
(labeled site: XhoI; a Hind III site is also marked on the template)

AAATCAGGTGACTGTTGAGAAAAGGAGGCGGATCCCG-BamHNR
(labeled features: stop of arsR, RBS, BamHI)

Digestion with XhoI & BamHI
(modified between stop of arsR and BamHI)

Ligation

PCR of pT0021 with LucFFB & LucFFH

LucFFB 5'-CGGGATCCATGAGGGGTTCCGAAGACG
(labeled features: BamHI, start codon; original BamHI was modified)

GAAAGTCCAAATTGTAAGCTTGGG-LucFFH
(modified in the vicinity of BamHI)

Cloning site for ORFs: BamHI & HindIII
No additional codons in the induced protein

Resulting construct (P, arsR, LucFF):
CTCGAG ... TGA GAAAAGGAGGCGGATCC ATG ... TAAG CTT
(labeled sites, left to right: XhoI, RBS, BamHI, HindIII)

FIG. 2.




Verification of pTHA/ORF clones 
by PCR and sequencing

FIG. 3.



(A) Functional assay on semi-solid support media

Frozen stock of phage 77 pTHA/ORF S. aureus RN4220 transformants

1:10 and 1:100 dilution in saline solution

5 µl of 1:10 dilution: streak onto agar plates containing
0, 2.5, 5, and 7.5 µM NaAsO2

3 µl of 1:10 and 1:100 dilution: spot onto agar plates containing
0, 2.5, 5, and 7.5 µM NaAsO2

O/N, 37°C

Compare bacterial growth on plates with and without NaAsO2

(B) Functional assay in liquid medium

O/N culture inoculated from frozen stock of
phage 77 pTHA/ORF S. aureus RN4220 transformants
  |
1:100 dilution of O/N culture
  | 2 h, 37°C, 250 rpm
Fresh culture
  | 150 µl
2.5 ml containing 0 and 5 µM NaAsO2
  | 3-5 h, 37°C, 250 rpm
Measure OD565; 1:10 serial dilution from 10^-1 to 10^-6
  | 20 µl of 10^-4 to 10^-[illegible]
Spot onto agar plate
  | O/N, 37°C
Count colonies
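
As an illustrative aid only, the colony counts from panel (B) could be converted to viable cells per ml and compared between induced and uninduced cultures as sketched below. The formula (colonies divided by plated volume times dilution) is the usual plate-count arithmetic rather than something stated in the figure; the function names and the example counts are hypothetical.

```python
def cfu_per_ml(colonies: int, spot_volume_ul: float, dilution: float) -> float:
    """Estimate colony-forming units per ml of undiluted culture.

    colonies: colonies counted in one spot
    spot_volume_ul: volume spotted, in microlitres (20 in the figure)
    dilution: dilution factor of the spotted sample, e.g. 1e-5
    """
    if colonies < 0 or spot_volume_ul <= 0 or not 0 < dilution <= 1:
        raise ValueError("invalid colony count, volume, or dilution")
    volume_ml = spot_volume_ul / 1000.0
    return colonies / (volume_ml * dilution)


def survival_ratio(induced_cfu: float, uninduced_cfu: float) -> float:
    """Ratio of viable counts with and without NaAsO2 induction."""
    if uninduced_cfu <= 0:
        raise ValueError("uninduced count must be positive")
    return induced_cfu / uninduced_cfu


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    uninduced = cfu_per_ml(colonies=23, spot_volume_ul=20, dilution=1e-5)
    induced = cfu_per_ml(colonies=4, spot_volume_ul=20, dilution=1e-4)
    print(uninduced, induced, survival_ratio(induced, uninduced))
```

A ratio well below 1 would be consistent with inhibition upon induction of the cloned ORF, although any threshold for calling an ORF inhibitory is a choice this sketch does not make.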
FIG. 7

Abbreviations: 

kan: gene encoding kanamycin resistance 

cat: gene encoding chloramphenicol resistance 

ori + and -: origin of replication in gram-positive and 

gram-negative bacteria, respectively 

arsR: gene encoding regulatory protein of the ars promoter 

P: ars promoter 

lucFF: gene encoding luciferase protein. This portion will 
be removed and replaced by individual S. aureus phage 
genes. 

Reference: 

Tauriainen et al., Appl. Environ. Microbiol. 1997. 63: 4456-4461